Appellants respectfully disagree with the Examiner's statement in Paper No. 20, page 2, 
that Appellants' Brief "does not include a statement that this grouping of claims does not stand or 
fall together and reasons in support thereof." Such a statement may be found on page 8 of 
Appellants' Brief. 

Argument 

Appellants respectfully request that the rejection of claims 1, 3-19, and 25-26 be 
withdrawn or reversed, and that the instant claims be permitted to proceed to allowance, because 
the Examiner has failed to establish a prima facie case of obviousness. The Examiner's Answer 
to Appellants' Appeal Brief only serves to emphasize that the Examiner continues to selectively 
consider the cited publications, ignoring those portions that clearly indicate the flaws in the 
Examiner's asserted prima facie case. Furthermore, the Examiner continues to rely on hindsight 
in support of the rejection, and has failed to provide any particular findings as to the reason that 
the skilled artisan, with no knowledge of the claimed invention, would have selected the 
components that the Examiner seeks to combine. 

As discussed in detail in Appellants' Appeal Brief, the Examiner incorrectly contends 
that it allegedly would have been obvious to select three amino acids - Arg-Pro-Arg - from the 
much larger Hin recombinase protein (disclosed in the Feng et al. publication) for combination 
with a class of polyamide molecules based on N-methylpyrrole and N-methylimidazole 
(disclosed in the Swalley, Parks, and Trauger publications) in order to provide the instantly 
claimed invention. Appellants described why the Arg-Pro-Arg sequence selected in isolation by 
the Examiner is neither chemically nor functionally similar to such pyrrole- and imidazole-based 
polyamides disclosed in the Swalley, Parks, and Trauger publications. Moreover, Appellants also 
described why the skilled artisan would readily acknowledge that the Arg-Pro-Arg sequence in 
isolation does not possess any DNA binding characteristics. Rather, it is only in the context of a 
much larger and well known DNA binding motif that the "Arg-Pro-Arg" sequence selected by 
the Examiner has any DNA binding characteristics whatsoever. 

Therefore, Appellants pointed out that one of ordinary skill in the art at the time the 
instant application was filed would have lacked any motivation to select the Arg-Pro-Arg 
sequence from a much larger protein structure, in order to graft that sequence onto a pyrrole- and 
imidazole-based polyamide to provide the instantly claimed invention; and that one of ordinary 
skill in the art would have also lacked a reasonable expectation that grafting an Arg-Pro-Arg 
sequence onto such polyamides would have any effect whatsoever on the ability of such 
polyamides to bind DNA. 

The Examiner continues to fail to consider the teachings of cited publications in their entirety 
with the knowledge available to the skilled artisan at the time the instant application was filed 

In attempting to rebut the arguments made by Appellants in their Appeal Brief, the 
Examiner continues to selectively consider only certain portions of the cited publications, while 
ignoring those portions that clearly demonstrate the fallacy of the Examiner's rebuttal. For 
example, Appellants discussed in detail why the skilled artisan would clearly understand from 
the cited Feng et al. publication that it is only in the context of the larger Hin recombinase 
protein that the Arg-Pro-Arg sequence exhibits any DNA binding properties whatsoever, and that 
the skilled artisan would not have excised these three residues from the much larger Hin 
recombinase protein for grafting onto a pyrrole- or imidazole-based polyamide. The Examiner's 
statement in reply that "the Feng et al. reference clearly states that the 'Arg-Pro-Arg' sequence 
maintains its ability to interact with the minor groove of DNA in a specific manner when 
positioned in an entirely different protein sequence, namely the Engrailed protein of Drosophila" 
(Paper No. 20, page 6) fails to consider that the Engrailed protein and the Hin recombinase 
protein are not "entirely different," as the Examiner contends. Instead, each of these two proteins 
relies on the very same well known structurally conserved helix-turn-helix DNA binding motif to 
provide specific binding to DNA. 

As described in the Feng et al. publication on page 348, right column, specific binding of 
DNA by Hin recombinase "requires both major groove interactions involving a helix-turn-helix 
(HTH) α-helix motif and minor groove interactions involving the sequence Gly139-Arg140-Pro141-
Arg142" (emphasis added). The helix-turn-helix motif is known by those of skill in the art to 
provide sequence specific DNA binding. See, e.g., The Dictionary of Cell Biology, 2nd Ed. 
(Lackie and Dow, 1995) (helix-turn-helix: "A motif associated with transcription factors, 
allowing them to bind to and recognize specific DNA sequences"). As shown in Figure 3 of the 
Feng et al. publication, the helix-turn-helix motif serves to present the Gly139-Arg140-Pro141-
Arg142 sequence in precisely the proper orientation to interact with the DNA molecule. 

Similarly, Engrailed is a homeodomain protein that contains the very same helix-turn-helix 
motif as Hin recombinase. See, Feng et al., Figure 10 (showing engrailed homeodomain); 
see also, The Penguin Dictionary of Biology, 10th Ed. (2000) (homeobox: "Conserved DNA 
sequence encoding DNA binding regions of many proteins, [including] homeodomains. . . 
[conferring] the helix-turn-helix motif upon a protein, giving it its DNA binding properties"). As 
shown in Figure 10 of the Feng et al. publication, the helix-turn-helix motifs of Hin recombinase 
and Engrailed interact with DNA in a highly conserved fashion. 

Thus, based on the well understood structure and function of the helix-turn-helix motif in 
proteins such as Hin recombinase and Engrailed, the skilled artisan would understand that it is 
only in the context of the strikingly similar 3-dimensional structure of these proteins that the Arg- 
Pro-Arg sequence selected by the Examiner exhibits any DNA binding characteristics 
whatsoever. The Examiner's contention that "the prior art has provided two examples wherein 
the sequence 'Arg-Pro-Arg' is instrumental in providing minor groove specific binding to DNA" 
(Paper No. 20, page 8) is absurdly reductionist, ignoring the context in which the skilled artisan 
would view the Feng et al. publication. Indeed, taken to its logical conclusion, the Examiner's 
argument would entitle the Examiner to assert that even a single amino acid in a protein is 
"instrumental" to the function of that protein. 

The Examiner has steadfastly ignored the fact that, as recognized in the Feng et al. 
publication, DNA binding requires a large number of contacts, both within the protein itself (e.g., 
formation of the helix-turn-helix motif), and between the protein and DNA (e.g., in both 
the major and minor DNA grooves), and is not conferred by a single 3-residue peptide. Nothing in 
the purported prima facie case would indicate to one of skill in the art that, in the absence of this 
helix-turn-helix motif, the "Arg-Pro-Arg" sequence would exhibit any DNA binding properties 
whatsoever, and nothing of record indicates that this helix-turn-helix motif would exist in the 
pyrrole- and imidazole-based polyamides of the Swalley, Parks, and Trauger publications. 
Therefore, the skilled artisan would not be motivated to combine an Arg-Pro-Arg sequence with 
pyrrole- and imidazole-based polyamides in order to provide the instantly claimed invention, or 
have a reasonable expectation that grafting the Arg-Pro-Arg sequence onto a pyrrole- or 
imidazole-based polyamide would have any effect whatsoever on the ability of such polyamides 
to bind DNA. 

The Examiner also fails to provide any reasons why the skilled artisan, with no knowledge of the 
claimed invention, would have selected the components selected by the Examiner 

Furthermore, in the face of a clear statement in the Feng et al. publication that specific 
binding of DNA by Hin recombinase "requires both major groove interactions involving a 
helix-turn-helix (HTH) α-helix motif and minor groove interactions involving the sequence 
Gly139-Arg140-Pro141-Arg142," the Examiner has provided no evidence to support the contention that the 
skilled artisan would allegedly be motivated to select the particular three amino acid segment 
chosen by the Examiner in isolation from the much larger 52 amino acid Hin recombinase DNA 
binding domain. See, In re Dance, 48 USPQ2d 1635, 1637 (Fed. Cir. 1998) (there must be some 
motivation, suggestion, or teaching of the desirability of making the specific combination that 
was made by applicant). 

For example, while the Feng et al. publication provides no information about the Arg- 
Pro-Arg sequence selected in isolation by the Examiner, the publication does state that the Gly 
residue in the sequence Gly139-Arg140 is essential for sequence specific DNA binding by Hin 
recombinase. See, Feng et al., page 348, column 3. Why then would the skilled artisan be 
motivated to select the Arg-Pro-Arg sequence without the Gly residue? The Examiner is silent on 
this point. 

Similarly, the Feng et al. publication also states that α-helix 3, which is located at the 
carboxyl-terminal end of the DNA binding domain, is the DNA recognition helix for the Hin 
recombinase protein. Feng et al., page 350, left column; Fig. 1, p. 348. Why would the skilled 
artisan be motivated to select the Arg-Pro-Arg sequence and not the sequence in α-helix 3? 
Again, the Examiner is silent. 

Additionally, the Feng et al. publication states that the Gly139-Arg140-Pro141-Arg142 
sequence adopts an extended conformation to present the 4 amino acid sequence in a proper 
manner to bind DNA, and that Ile144 is critical in maintaining this orientation. Feng et al., page 
351, columns 2-3. Why would the skilled artisan be motivated to select the Arg-Pro-Arg 
sequence and not a sequence that includes this Ile residue? Again, silence by the Examiner. 

Moreover, as discussed above, the skilled artisan understands that the helix-turn-helix 
motif itself can confer sequence-specific DNA binding properties on a protein. Why would the 
skilled artisan be motivated to select the Arg-Pro-Arg sequence and not all or a portion of the 
helix-turn-helix motif? Again, silence. 

Finally, the skilled artisan also understands that there are numerous DNA binding 
proteins that use a motif other than the helix-turn-helix motif, such as the helix-loop-helix motif, 
the leucine zipper motif, and the zinc finger motif. See, e.g., Dictionary of Cell Biology 
definitions of each of these motifs. Why would the skilled artisan be motivated to zero in on only 
the helix-turn-helix proteins, such as Hin recombinase, at all? The Examiner provides no 
explanation. 

Viewed in this light, it is apparent that the only means by which the Examiner has 
selected the Arg-Pro-Arg sequence is by hindsight, "decomposing an invention into its 
constituent elements, finding each element in the prior art, and then claiming that it is easy to 
reassemble these elements into the invention." In re Mahurkar, 28 USPQ2d 1801, 1817 (N.D. Ill. 
1993). The Court of Appeals for the Federal Circuit ("CAFC") has repeatedly counseled that "the 
best defense against the subtle but powerful attraction of a hindsight-based obviousness analysis 
is rigorous application of the requirement for a showing of the teaching or motivation to combine 
prior art references." In re Dembiczak, 50 USPQ2d 1614, 1617 (Fed. Cir. 1999). 

The Examiner's statement that "any judgement on obviousness is in a sense necessarily a 
reconstruction based upon hindsight reasoning" (Paper No. 20, page 9) exhibits a complete lack 
of understanding by the Examiner that "particular findings must be made as to the reason the 
skilled artisan, with no knowledge of the claimed invention, would have selected these 
components for combination in the manner claimed" in order to support an obviousness 
rejection. In re Kotzab, 55 USPQ2d 1313, 1317 (Fed. Cir. 2000) (emphasis added). Because the 
Examiner has failed to provide such particular findings, no prima facie case of obviousness has 
been established. 

In contradiction to the Examiner's contention, the Arg-Pro-Arg sequence excised by the 
Examiner from Hin recombinase is not functionally or structurally similar to the polyamides 
disclosed in the Swalley, Parks, and Trauger publications 

Furthermore, the Examiner's axiomatic statement that "it is prima facie obvious to 
combine two compositions each of which is taught by the prior art to be useful for the same 
purpose" (Paper No. 20, page 8) can provide no solace to the Examiner. As discussed above, the 
Examiner's contention that "the prior art has provided two examples wherein the sequence 'Arg- 
Pro-Arg' is instrumental in providing minor groove specific binding to DNA" is reductionist in 
the extreme, and would logically lead to the absurd conclusion that even a single amino acid in 
the Hin recombinase structure is "instrumental" to DNA binding (and apparently that the skilled 
artisan would be motivated to select it over any other residue or combination of residues in the 
protein). 

In contrast to the Arg-Pro-Arg sequence selected by the Examiner, the pyrrole- and 
imidazole-based polyamides disclosed in the Swalley, Parks, and Trauger publications were 
actually demonstrated to specifically bind to DNA in and of themselves. Thus, in contradiction to 
the Examiner's contention, nothing in the purported prima facie case indicates that any sequence 
(other than perhaps the entire Hin recombinase DNA binding domain) is "useful for the same 
purpose" as the polyamides disclosed in the Swalley, Parks, and Trauger publications. Similarly, 
nothing in the purported prima facie case indicates that "the arginine and proline residues are 
disclosed in Feng et al. [are] functionally similar to the polyamide structures of Swalley, Parks, 
and Trauger." Paper No. 20, page 7. 

The Examiner has attempted to bolster the fatally flawed prima facie case with two newly 
introduced arguments. First, the Examiner contends that "in both the Hin recombinase and the 
Engrailed protein, the Arg-Pro-Arg sequence is positioned in the amino terminus, and it is in this 
environment that minor groove specific binding of DNA occurs." Unfortunately for the 
Examiner, however, the Arg-Pro-Arg sequence is not positioned at the amino terminus of either 
Hin recombinase or Engrailed. In Engrailed, the residues are in positions 3, 4, and 5; while in 
Hin recombinase, the residues are in positions 140, 141, and 142, where position 1 represents the 
amino terminus. The Arg-Pro-Arg residues do, however, appear to lie at or near the amino group 
end of the helix-turn-helix motif (see, e.g., Feng et al., Figure 3), again emphasizing that it is the 
structure of the DNA binding domain in its entirety, and not some isolated property of the Arg- 
Pro-Arg sequence itself, that provides DNA binding properties to Hin recombinase. 

Second, referring to only Arg140, the Examiner also contends that "Feng et al. teaches that 
the ε-amine of the Arg side chain is capable of interacting with the N-3 group of Adenine, and 
the main chain Arg permits a second hydrogen bond with the O-2 of Thymine," concluding that 
the structure of Arg "permits a stable interaction with A,T base pairs [in similar fashion to the] 
side-by-side pairing in the polyamides of Parks et al., Swalley et al., and Trauger et al." Paper 
No. 20, page 7. 

Once again, however, the Examiner's arguments only serve to emphasize the hindsight 
reconstruction being employed by the Examiner to justify the flawed obviousness rejection. The 
Examiner provides no insight as to why the skilled artisan would focus on this single amino acid, 
while ignoring the fact that Gly139 also makes contacts with Adenine, and that substitution of 
Adenine with Guanine would disrupt those contacts. Feng et al., page 351, right column. 
Similarly, one is left without an explanation as to why the skilled artisan would ignore the final 
six residues Ile185-Lys186-Lys187-Arg188-Met189-Asn190, when the Feng et al. publication indicates 
that the binding of these residues to DNA "recalls AT-specific binding of minor groove drugs 
such as netropsin and distamycin." Feng et al., page 352, left column. Indeed, the Feng et al. 
publication is replete with description of various Hin recombinase residues that bind to DNA and 
the contacts made by those residues. Thus, the Examiner again selectively considers only specific 
portions of the Feng et al. publication, ignoring numerous sections that clearly demonstrate the 
flaws in the asserted prima facie case. 

The Examiner has failed to consider the problem to be solved by the instant invention, 
rendering the obviousness analysis fatally flawed 

The Court of Appeals for the Federal Circuit has counseled on the importance of 
considering the problem to be solved in any obviousness determination. See, e.g., Sibia 
Neurosciences, Inc. v. Cadus Pharmaceutical Corp., 55 USPQ2d 1927, 1933 (Fed. Cir. 2000). 
Because the Examiner has failed to properly consider the problem to be solved by the instant 
invention, the obviousness analysis performed by the Examiner is fatally flawed and cannot 
stand. 

The Examiner's asserted prima facie case is based on an alleged suggestion in the Feng et 
al. publication that "the Arg-Pro-Arg sequence would maintain specific minor groove binding 
especially when positioned at a terminus of the molecule." Paper No. 20, page 6. But, as pointed 
out in the Appeal Brief, the "positive patch" of the instant claims is not intended to serve a minor 
groove binding function. Rather, the positively charged group is intended to disrupt interactions 
between proteins and the phosphate backbone or major groove of the DNA molecule. See, e.g., 
specification, page 14, lines 20-29. There is nothing of record in the asserted prima facie case to 
indicate that the skilled artisan would be motivated to select an Arg-Pro-Arg sequence, as 
disclosed in the Feng et al. publication, in order to disrupt interactions between proteins and the 
phosphate backbone or major groove of the DNA molecule. 

The Examiner's axiomatic statement in reply that "[a]lthough the claims are interpreted in 
light of the specification, limitations from the specification are not read into the claims" (Paper 
No. 20, page 9) completely ignores the extremely relevant question of whether the skilled artisan, 
seeking to design molecules that disrupt interactions between proteins and the phosphate 
backbone or major groove of the DNA molecule, would have been led to consider the 
combination of features proposed by the Examiner. Further, the Examiner's reply indicates a 
failure to understand the importance of considering the problem to be solved in performing an 
obviousness determination. Because of this failure, no properly supported prima facie case of 
obviousness has been demonstrated. 



Conclusion 

In view of the foregoing, Appellants respectfully submit that the Examiner's asserted 
prima facie case of obviousness is based upon a selective, piecemeal reading of the cited 
publications and a failure to understand and perform the necessary analysis or to provide 
sufficient findings on which a proper obviousness rejection may be based. Instead, it becomes 
clear that the Examiner has relied solely on hindsight in selecting certain components, and 
ignoring others, for combination. Therefore, the Examiner has failed to carry the burden of 
establishing a prima facie case of obviousness. Appellants respectfully submit the instant claims 
are in condition for allowance, and respectfully request that the rejections be withdrawn or 
reversed, and that the rejected claims, together with the claims previously indicated as allowable, 
be allowed to issue. 



Respectfully submitted, 



FOLEY & LARDNER 





Registration No. 46,230 



P.O. Box 80278 

San Diego, CA 92138-0278 

Telephone: (858) 847-6700 

Recognition of a 5'-(A,T)GGG(A,T)2-3' Sequence in the Minor 
Groove of DNA by an Eight-Ring Hairpin Polyamide 

Susanne E. Swalley, Eldon E. Baird, and Peter B. Dervan* 

Contribution from the Division of Chemistry and Chemical Engineering, 
California Institute of Technology, Pasadena, California 91125 

Received April 8, 1996® 

Abstract: The use of pyrrole-imidazole polyamides for the recognition of core 5'-GGG-3' sequences in the minor 
groove of double stranded DNA is described. Two hairpin pyrrole-imidazole polyamides, ImImIm-γ-PyPyPy-β-Dp 
and ImImImPy-γ-PyPyPyPy-β-Dp (Im = N-methylimidazole-2-carboxamide, Py = N-methylpyrrole-2-carboxamide, 
β = β-alanine, γ = γ-aminobutyric acid, and Dp = ((dimethylamino)propyl)amide), as well as the 
corresponding EDTA affinity cleaving derivatives, were synthesized and their DNA binding properties analyzed. 
Quantitative DNase I footprint titrations demonstrate that ImImIm-γ-PyPyPy-β-Dp binds the formal match sequence 
5'-AGGGA-3' with an equilibrium association constant of Ka = 5 × 10^6 M^-1 (10 mM Tris-HCl, 10 mM KCl, 10 
mM MgCl2, and 5 mM CaCl2, pH 7.0 and 22 °C). ImImImPy-γ-PyPyPyPy-β-Dp binds the same site, 5'-AGGGAA-3', 
approximately two orders of magnitude more tightly than the six ring polyamide, with an equilibrium association 
constant of Ka = 4 × 10^8 M^-1. The eight-ring hairpin polyamide demonstrates greater specificity for single base pair 
mismatches than does the six ring hairpin. Polyamides with an EDTA·Fe(II) moiety at the carboxy terminus confirm 
that each hairpin binds in a single orientation. The high affinity recognition of a 5'-GGG-3' core sequence by an 
eight ring polyamide containing three contiguous imidazole amino acids demonstrates the versatility of 
pyrrole-imidazole polyamides and broadens the sequence repertoire for DNA recognition. 



Introduction 

Pyrrole-imidazole polyamide-DNA complexes afford a 
general method for the design of non-natural molecules for 
sequence-specific recognition in the minor groove of DNA.1-5 
Polyamides containing N-methylpyrrole (Py) and N-methylimidazole 
(Im) carboxamides bind to the minor groove as side-by-side 
antiparallel dimers and are capable of specific recognition 
of sequences containing G,C base pairs, where the N3 
of each imidazole forms a hydrogen bond with a single guanine 
exocyclic amino group.1 The side-by-side pairing of an 
imidazole ring from one polyamide with a pyrrole ring from 
the second polyamide recognizes a G·C base pair, while a 
pyrrole-imidazole combination recognizes a C·G base pair.1 
Finally, a pyrrole-pyrrole pair recognizes either an A·T or T·A 
base pair.1,2 By employing the 2:1 model, specific recognition 
of the sequences 5'-(A,T)G(A,T)C(A,T)-3', 5'-(A,T)G(A,T)3-3', 
5'-(A,T)2G(A,T)2-3', and 5'-(A,T)GCGC(A,T)-3' has been 
achieved.1-5 
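
The pairing rules just described amount to a small lookup table. The following Python sketch is an illustration only and is not part of the reported work. The names PAIRING_RULES and read_ring_pairs are hypothetical, and the sketch assumes that rings are listed from N-terminus to C-terminus, that the second strand lies antiparallel to the first, and that the N-to-C direction of the first strand runs along the 5'-to-3' direction of the recognized DNA strand. Pairings not described in the text (for example Im/Im) are rejected rather than guessed.

```python
# Pairing rules as stated in the text:
# (ring on first strand, ring on second strand) -> base read on the 5'->3' strand.
PAIRING_RULES = {
    ("Im", "Py"): "G",  # Im/Py pair recognizes a G·C base pair
    ("Py", "Im"): "C",  # Py/Im pair recognizes a C·G base pair
    ("Py", "Py"): "W",  # Py/Py pair recognizes A·T or T·A (W = A or T)
}


def read_ring_pairs(first_strand, second_strand):
    """Return the sequence read by side-by-side ring pairs.

    Both strands are given N-terminus to C-terminus. The second strand is
    reversed before pairing because the two strands are assumed to lie
    antiparallel in the minor groove.
    """
    if len(first_strand) != len(second_strand):
        raise ValueError("both strands must contain the same number of rings")

    letters = []
    for ring_a, ring_b in zip(first_strand, reversed(second_strand)):
        pair = (ring_a, ring_b)
        if pair not in PAIRING_RULES:
            # The text gives no rule for this pairing, so refuse to guess.
            raise ValueError(f"no pairing rule stated for {ring_a}/{ring_b}")
        letters.append(PAIRING_RULES[pair])
    return "".join(letters)


if __name__ == "__main__":
    # Two antiparallel ImPyPy strands (a homodimer under the assumptions above).
    print(read_ring_pairs(["Im", "Py", "Py"], ["Im", "Py", "Py"]))
    # Three imidazoles paired against three pyrroles.
    print(read_ring_pairs(["Im", "Im", "Im"], ["Py", "Py", "Py"]))
```

Under these assumptions, pairing ImPyPy against a second, antiparallel ImPyPy reads GWC, which is consistent with the 5'-(A,T)G(A,T)C(A,T)-3' core listed above. The flanking (A,T) positions are not modeled by the sketch.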

Covalent head-to-tail linkage of two polyamides by γ-aminobutyric 
acid (γ) to form a "hairpin" polyamide results in both 
increased affinity and specificity, as compared to the unlinked 
* Abstract published in Advance ACS Abstracts, August 15, 1996. 

(1) (a) Wade, W. S.; Mrksich, M.; Dervan, P. B. J. Am. Chem. Soc. 
1992, 114, 8783. (b) Mrksich, M.; Wade, W. S.; Dwyer, T. J.; Geierstanger, 
B. H.; Wemmer, D. E.; Dervan, P. B. Proc. Natl. Acad. Sci. U.S.A. 1992, 
89, 7586. (c) Wade, W. S.; Mrksich, M.; Dervan, P. B. Biochemistry 1993, 
32, 11385. 

polyamides.6 For instance, the 1:1 hairpin motif has been used 
to recognize 5'-(A,T)G(A,T)3-3' by ImPyPy-γ-PyPyPy-Dp with 
approximately 300-fold greater affinity than the unlinked 
polyamides, ImPyPy and PyPyPy.5 A C-terminal β-alanine 
residue increases both affinity and specificity and facilitates solid 
phase synthesis, as recently demonstrated with 
ImPyPy-γ-PyPyPy-β-Dp.7,8 Furthermore, a sequence containing two 
contiguous G·C base pairs, 5'-(A,T)GG(A,T)2-3', has been 
recognized by ImImPy-γ-PyPyPy-β-Dp.9 

To further expand the sequence repertoire available with the 
hairpin motif, two polyamides containing three contiguous 
imidazole rings, ImImIm-γ-PyPyPy-β-Dp (1) and 
ImImImPy-γ-PyPyPyPy-β-Dp (2), and the corresponding affinity cleaving 
analogs, ImImIm-γ-PyPyPy-β-Dp-EDTA (1-E) and 
ImImImPy-γ-PyPyPyPy-β-Dp-EDTA (2-E), were synthesized using solid 
phase synthetic protocols (Figures 1 and 2).7 Specific hydrogen 
bonds are expected to form between each imidazole N3 and 
one of the three individual guanine 2-amino groups on the floor 
of the minor groove (Figure 1). The eight-ring hairpin polyamide, 
with a pyrrole between the C-terminal imidazole and 
the γ-linker, was synthesized to examine whether the positioning 
of the final imidazole immediately adjacent to the turn would 
adversely affect binding affinity or specificity. We report here 
the affinities and relative specificities of these tris-imidazole 
polyamides as determined by three separate techniques: 
MPE·Fe(II) footprinting,10 DNase I footprinting,11 and affinity 
cleaving.12 Information about binding site size is gained from 
MPE·Fe(II) footprinting, while quantitative DNase I footprint titrations 
(6) Mrksich, M.; Parks, M. E.; Dervan, P. B. J. Am. Chem. Soc. 1994, 
116, 7983. 

(7) Baird, E. E.; Dervan, P. B. J. Am. Chem. Soc. 1996, 118, 6141. 

(8) Parks, M. E.; Baird, E. E.; Dervan, P. B. J. Am. Chem. Soc. 1996, 
118, 6147. 

(9) Parks, M. E.; Baird, E. E.; Dervan, P. B. J. Am. Chem. Soc. 1996, 
118, 6153. 

(10) (a) Van Dyke, M. W.; Dervan, P. B. Biochemistry 1983, 22, 2373. 
(b) Van Dyke, M. W.; Dervan, P. B. Nucleic Acids Res. 1983, 11, 5555. 

(A) ImImIm-γ-PyPyPy-β-Dp 
(A,T)GGG(A,T) 

5'-W G G G W-3' 
3'-W C C C W-5' 

(B) ImImImPy-γ-PyPyPyPy-β-Dp 
(A,T)GGG(A,T)(A,T) 

5'-W G G G W W-3' 
3'-W C C C W W-5' 

Figure 1. Binding model for the complexes formed between the DNA 
and either the six-ring hairpin polyamide ImImIm-γ-PyPyPy-β-Dp (A) 
or the eight-ring hairpin polyamide ImImImPy-γ-PyPyPyPy-β-Dp (B). 
Circles with dots represent lone pairs of N3 of purines and O2 of 
pyrimidines. Circles containing an H represent the N2 hydrogen of 
guanine. Putative hydrogen bonds are illustrated by dotted lines. Ball 
and stick models are also shown. Shaded and nonshaded circles denote 
imidazole and pyrrole carboxamides, respectively. Nonshaded diamonds 
represent the β-alanine residue. W represents either an A or T base. 

allow the determination of equilibrium association constants (Ka) 
for the polyamides to a variety of match and single base pair 
mismatch sequences. Affinity cleavage studies confirm the 
binding orientation and stoichiometry of the 1:1 hairpin:DNA 
complex. 



Results 



Synthesis of Polyamides. The polyamides 
ImImIm-γ-PyPyPy-β-Dp (1) and ImImImPy-γ-PyPyPyPy-β-Dp (2) were 
synthesized in a stepwise manner from Boc-β-alanine-Pam resin 
using recently described Boc-chemistry protocols in 14 and 18 
steps, respectively.7 The polyamides were then cleaved by a 
single step aminolysis reaction with ((dimethylamino)propyl)amine 
and subsequently purified by HPLC chromatography. The 
syntheses of the polyamides, ImImImPy-γ-PyPyPyPy-β-Dp (2), 
ImImImPy-γ-PyPyPyPy-β-Dp-NH2 (2-NH2), and 
ImImImPy-γ-PyPyPyPy-β-Dp-EDTA (2-E), are outlined (Figure 3). For 
the synthesis of the EDTA analog, a sample of resin is treated 
with 3,3'-diamino-N-methyldipropylamine (55 °C) and then 
purified by preparatory HPLC to provide 2-NH2. The polyamide 
ImImImPy-γ-PyPyPyPy-β-Dp-NH2 (2-NH2) provides a 
free aliphatic primary amine group suitable for modification. 
This polyamide-amine is then treated with an excess of the 
dianhydride of EDTA (DMSO/NMP, DIEA, 55 °C) and the 
remaining anhydride hydrolyzed (0.1 M NaOH, 55 °C). The 
EDTA modified polyamide is then isolated by preparatory 
HPLC. 

Identification of Binding Site Size and Orientation by 
MPE·Fe(II) Footprinting and Affinity Cleaving. MPE·Fe(II) 
footprinting on the 3'- and 5'-32P end-labeled 274 base pair 
EcoRI/PvuII restriction fragment from plasmid pSES1 (25 mM 
Tris-acetate, 10 mM NaCl, 100 μM calf thymus DNA, pH 7.0 
and 22 °C) reveals that the polyamides, each at 10 μM 
concentration, are binding to the four designed sites, 
5'-AGGGA(A)-3', 5'-AGGCA(A)-3', 5'-TGGGT(C)-3', and 
5'-TGGGC(T)-3' (Figures 4 and 5). No footprinting is seen with 
either compound at the single base pair mismatch site, 
5'-AGGAA(A)-3' (Figure 4, quantitated data not shown). The 
footprinting patterns for the six-ring polyamide are consistent 
with five base pair binding sites, while the patterns for the 
eight-ring polyamide are indicative of six base pair binding sites. 
Affinity cleavage experiments on the same restriction fragment 
(25 mM Tris-acetate, 200 mM NaCl, 50 μg/mL glycogen, pH 
7.0 and 22 °C) by the six-ring and eight-ring EDTA·Fe(II) 
analogs reveal cleavage patterns that are 3'-shifted and appear 
on only one side of each binding site (Figures 4 and 5). 
ImImIm-γ-PyPyPy-β-Dp-EDTA·Fe(II), at 1 μM, and 
ImImImPy-γ-PyPyPyPy-β-Dp-EDTA·Fe(II), at 100 nM, show cleavage 
patterns that demonstrate recognition of the same binding 
sites identified by MPE·Fe(II) footprinting. No carrier DNA 
was used in these experiments, and thus the relative cleavage 
intensities indicate that the six-ring polyamide binds most 
strongly to the two match sites 5'-AGGGA-3' and 5'-TGGGT-3', 
followed by the end mismatch 5'-TGGGC-3'. The core 
mismatch 5'-AGGCA-3', with little appreciable cleavage at 1 
μM concentration, is bound more weakly. Similarly, the 
eight-ring polyamide binds most strongly at 100 nM to 5'-AGGGAA-3', 
much less strongly to 5'-TGGGTC-3', and not significantly 
to 5'-TGGGCT-3' and 5'-AGGCAA-3'. 

Determination of Binding Affinities by Quantitative DNase 
I Footprinting. Quantitative DNase I footprint titration experiments 
(10 mM Tris-HCl, 10 mM KCl, 10 mM MgCl2, and 5 
mM CaCl2, pH 7.0 and 22 °C) were performed to determine 

(11) (a) Brenowitz, M.; Senear, D. F.; Shea, M. A.; Ackers, G. K. 
Methods Enzymol. 1986, 130, 132. (b) Brenowitz, M.; Senear, D. F.; Shea, 
M. A.; Ackers, G. K. Proc. Natl. Acad. Sci. U.S.A. 1986, 83, 8462. (c) 
Senear, D. F.; Brenowitz, M.; Shea, M. A.; Ackers, G. K. Biochemistry 
1986, 25, 7344. 

(12) (a) Schultz, P. G.; Taylor, J. S.; Dervan, P. B. J. Am. Chem. Soc. 
1982, 104, 6861. (b) Schultz, P. G.; Dervan, P. B. J. Biomol. Struct. Dyn. 
1984, 1, 1133. (c) Taylor, J. S.; Schultz, P. G.; Dervan, P. B. Tetrahedron 
1984, 40, 457. 

1-E ImImIm-γ-PyPyPy-β-Dp-EDTA·Fe(II) 
Figure 2. Structures of the tris-imidazole polyamides and their EDTA 

the equilibrium association constants of the polyamides for the four bound sites
(Table 1).11 ImImIm-γ-PyPyPy-β-Dp binds the two match sites 5'-AGGGA-3' and
5'-TGGGT-3' with equilibrium association constants of Ka = 4.6 × 10⁶ and
7.6 × 10⁶ M⁻¹, respectively (Figures 6-8). The sequence 5'-TGGGC-3', which has a
mismatch in the non-core final position (an "end" mismatch), is bound with 6-fold
lower affinity than the best match, while the core mismatch sequence 5'-AGGCA-3' is
bound with 9-fold lower affinity. ImImImPy-γ-PyPyPyPy-β-Dp binds the match site
5'-AGGGAA-3' with an equilibrium association constant of Ka = 3.7 × 10⁸ M⁻¹. The
end mismatch 5'-TGGGTC-3' is bound with 26-fold lower affinity, and the two core
mismatches, 5'-AGGCAA-3' and 5'-TGGGCT-3', are bound with 130-fold and 220-fold
lower affinity, respectively. Neither polyamide shows any appreciable binding to the
core mismatch 5'-AGGAA(A)-3' (data not shown).
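
The fold differences quoted above are ratios of equilibrium association constants.
A minimal Python sketch of that arithmetic follows, assuming the Table 1 values as
read from this copy (the six-ring core-mismatch entry is taken as 8.6 × 10⁵ M⁻¹,
the reading consistent with the stated 9-fold figure). The dictionary layout and
the function name are illustrative and do not come from the paper.

```python
# Equilibrium association constants (M^-1), as read from Table 1.
# Layout and names are illustrative assumptions, not the authors' code.
KA = {
    "six-ring": {
        "5'-AGGGA-3'": 4.6e6,   # match
        "5'-TGGGT-3'": 7.6e6,   # match (highest affinity)
        "5'-TGGGC-3'": 1.3e6,   # end mismatch
        "5'-AGGCA-3'": 8.6e5,   # core mismatch (reading consistent with "9-fold")
    },
    "eight-ring": {
        "5'-AGGGAA-3'": 3.7e8,  # match
        "5'-TGGGTC-3'": 1.4e7,  # end mismatch
        "5'-TGGGCT-3'": 1.7e6,  # core mismatch
        "5'-AGGCAA-3'": 2.9e6,  # core mismatch
    },
}


def fold_lower(ka_by_site):
    """Return, for each site, how many times weaker it binds than the best site."""
    best = max(ka_by_site.values())
    return {site: best / ka for site, ka in ka_by_site.items()}


if __name__ == "__main__":
    for polyamide, sites in KA.items():
        for site, fold in fold_lower(sites).items():
            print(f"{polyamide:10s} {site:14s} {fold:6.1f}-fold lower than best match")
```

The ratios are taken against the highest-affinity site of each polyamide, which is
how the text defines "best match"; the paper reports them rounded to one or two
significant figures.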

Discussion 

The two hairpin polyamides recognize the targeted 5'-AGGGA(A)-3' sequence, as
determined by MPE·Fe(II) footprinting and affinity cleaving, demonstrating specific
recognition by polyamides of sequences containing three contiguous G·C base pairs,
5'-GGG-3'. Affinity cleavage data indicate that each polyamide is binding in the
minor groove in a single orientation, consistent with the hairpin binding model
(Figures 1 and 5). The relative intensities of the cleavage patterns are consistent
with quantitative DNase I footprint titration results.

Quantitative DNase I footprint titrations reveal that ImImIm-γ-PyPyPy-β-Dp binds the
designed match sites 5'-AGGGA-3' and 5'-TGGGT-3' with equilibrium association
constants of Ka = 5 × 10⁶ and 8 × 10⁶ M⁻¹, respectively. For comparison, the
analogous six-ring hairpins containing only one and two contiguous imidazoles,
ImPyPy-γ-PyPyPy-β-Dp and ImImPy-γ-PyPyPy-β-Dp, bind their respective match sites
with affinities of Ka ≈ 10⁸ M⁻¹.8,9 The addition of the third imidazole reduces
binding affinity significantly, perhaps due to the inability of the polyamide to sit
deeply in the sterically crowded minor groove, decreasing energetically favorable
van der Waals contacts. Examination of the eight-ring hairpin,
ImImImPy-γ-PyPyPyPy-β-Dp, reveals a dramatically increased affinity, the match site
5'-AGGGAA-3' being bound with an equilibrium association constant of
Ka = 4 × 10⁸ M⁻¹. The 80-fold increase in affinity in expanding from a three-ring to
a four-ring hairpin polyamide mirrors the 66-fold enhancement of ImPyPyPy-Dp




2 ImImImPy-γ-PyPyPyPy-β-Dp

2-E ImImImPy-γ-PyPyPyPy-β-Dp-EDTA·Fe(II)

derivatives synthesized using solid-phase methodology.7

over ImPyPy-Dp that was observed in the 2:1 homodimeric
polyamide:DNA motif.13

In addition to the observation that the eight-ring ImImImPy-γ-PyPyPyPy-β-Dp binds
with higher affinity than the six-ring ImImIm-γ-PyPyPy-β-Dp, the enhanced
specificity is perhaps more significant. The six-ring hairpin polyamide binds
5'-TGGGC-3', an end mismatch site, with 6-fold lower affinity compared to
5'-TGGGT-3' (the highest affinity match), while the eight-ring hairpin polyamide
binds its end mismatch site 5'-TGGGTC-3' with 26-fold lower affinity compared to its
match site 5'-AGGGAA-3'. Similarly, 5'-AGGCA-3', a site containing a single base
pair core mismatch, is bound by the six-ring system with 9-fold lower affinity,
while the two core mismatch sites for the eight-ring system, 5'-AGGCAA-3' and
5'-TGGGCT-3', are recognized with 130-fold and 220-fold lower affinity,
respectively. For both polyamides, binding of a site with a core single base pair
mismatch results in a greater energetic penalty than binding of a site with a single
base pair mismatch in the end position. The increased specificity of
ImImImPy-γ-PyPyPyPy-β-Dp over ImImIm-γ-PyPyPy-β-Dp may indicate that an imidazole
placed immediately to the N-terminal side of the γ turn does not form as strong a
hydrogen bond as in other positions.
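
The "energetic penalty" mentioned above can be put on a free-energy scale with the
standard relation ΔΔG = RT ln(Ka,match / Ka,mismatch). The paper itself reports only
fold differences, so the following Python sketch is an illustration under that
textbook assumption, evaluated at the 22 °C assay temperature. The function name is
hypothetical.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal mol^-1 K^-1
T_KELVIN = 295.15   # 22 °C, the temperature of the footprinting assays


def mismatch_penalty(fold_lower):
    """Free-energy penalty (kcal/mol) for a site bound `fold_lower` times more weakly.

    Uses ddG = RT * ln(Ka_match / Ka_mismatch); a ratio of 1 gives zero penalty.
    """
    if fold_lower <= 0:
        raise ValueError("fold difference must be positive")
    return R_KCAL * T_KELVIN * math.log(fold_lower)


if __name__ == "__main__":
    # Fold differences as quoted in the text for each polyamide and mismatch type.
    quoted = [
        ("six-ring, end mismatch", 6),
        ("six-ring, core mismatch", 9),
        ("eight-ring, end mismatch", 26),
        ("eight-ring, core mismatch (5'-AGGCAA-3')", 130),
        ("eight-ring, core mismatch (5'-TGGGCT-3')", 220),
    ]
    for label, fold in quoted:
        print(f"{label}: {mismatch_penalty(fold):.2f} kcal/mol")
```

On this scale the ordering in the text is preserved: core mismatches cost more than
end mismatches for both polyamides, and every penalty is larger for the eight-ring
hairpin.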

Implications for the Design of Minor Groove Binding Molecules. Pyrrole-imidazole
polyamides have been used to recognize a variety of target sequences containing A·T
and G·C base pairs.1,2,4,5,9 By recognizing sequences containing three contiguous
G·C base pairs, 5'-(A,T)GGG(A,T)-3' and 5'-(A,T)GGG(A,T)2-3', this work expands the
sequence-composition repertoire targetable by the hairpin polyamide motif. Both
affinity and specificity for a G,C rich sequence are increased by the use of an
eight-ring hairpin polyamide. This ability to enlarge the sequence repertoire,
combined with rapid solid-phase synthesis, brings us one step closer to a universal
approach for the recognition of any desired DNA sequence by strictly chemical
methods.

Experimental Section 

Dicyclohexylcarbodiimide (DCC), hydroxybenzotriazole (HOBt),
2-(1H-benzotriazole-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate (HBTU), and
0.2 mmol/g of Boc-β-alanine-(4-carboxamidomethyl)benzyl
ester-copoly(styrene-divinylbenzene) resin (Boc-β-Pam-Resin) were purchased from
Peptides International. N,N-Diisopropylethylamine (DIEA), N,N-dimethylformamide
(DMF), N-methylpyrrolidone (NMP), DMSO/NMP, acetic anhydride (Ac2O), and 0.0002 M
potassium cyanide/pyridine were purchased from Applied Biosystems.
Boc-γ-aminobutyric acid was from NOVA Biochem, dichloromethane (DCM) and
triethylamine (TEA) were reagent grade from EM, thiophenol (PhSH) and
((dimethylamino)propyl)amine were from Aldrich, trifluoroacetic acid (TFA) was from
Halocarbon, phenol was from Fisher, and ninhydrin was from Pierce. All reagents were
used without further purification.

(13) Kelly, J. J.; Baird, E. E.; Dervan, P. B. Proc. Natl. Acad. Sci. U.S.A.
1996, 93, 6981.

[Figure 3 scheme labels: R = CH3, ImImImPy-γ-PyPyPyPy-β-Dp (2); R = (CH2)3NH2,
ImImImPy-γ-PyPyPyPy-β-Dp-NH2 (2-NH2); steps xx and xxi give
ImImImPy-γ-PyPyPyPy-β-Dp-EDTA (2-E)]

Figure 3. (Box) Pyrrole and [...] 6, and imidazole-2-carboxylic acid 7. Solid-phase
synthetic scheme for ImImImPy-γ-PyPyPyPy-β-Dp-EDTA prepared from commercially
available Boc-β-Pam-Resin (0.2 mmol/g): (i) 80% TFA/DCM, 0.4 M PhSH; (ii) BocPy-OBt,
DIEA, DMF; (iii) 80% TFA/DCM, 0.4 M PhSH; (iv) BocPy-OBt, DIEA, DMF; (v) 80%
TFA/DCM, 0.4 M PhSH; (vi) BocPy-OBt, DIEA, DMF; (vii) 80% TFA/DCM, 0.4 M PhSH;
(viii) BocPy-OBt, DIEA, DMF; (ix) 80% TFA/DCM, 0.4 M PhSH; (x) Boc-γ-aminobutyric
acid (HBTU, DIEA), DMF; (xi) 80% TFA/DCM, 0.4 M PhSH; (xii) BocPy-OBt, DIEA, DMF;
(xiii) 80% TFA/DCM, 0.4 M PhSH; (xiv) BocIm-OBt (DCC/HOBt), DIEA, DMF; (xv) 80%
TFA/DCM, 0.4 M PhSH; (xvi) BocIm-OBt (DCC/HOBt), DIEA, DMF; (xvii) 80% TFA/DCM,
0.4 M PhSH; (xviii) imidazole-2-carboxylic acid (HBTU/DIEA);
(xix) ((dimethylamino)propyl)amine or 3,3'-diamino-N-methyldipropylamine, 55 °C;
(xx) EDTA-dianhydride, DMSO/NMP, DIEA, 55 °C; (xxi) 0.1 M NaOH.

Quik-Sep polypropylene disposable filters were purchased from Isolab Inc. and were
used for filtration of DCU. Disposable polypropylene filters were also used for
washing resin for ninhydrin and picric acid tests and for filtering predissolved
amino acids into reaction vessels. A shaker for manual solid-phase synthesis was
obtained from St. John Associates, Inc. Screw-cap glass peptide synthesis reaction
vessels (5 and 20 mL) with a #2 sintered glass frit were made as described by
Kent.14 ¹H NMR spectra were recorded on a General Electric-QE NMR

(14) Kent, S. B. H. Annu. Rev. Biochem. 1988, 57, 957.



spectrometer at 300 MHz in DMSO-d6, with chemical shifts reported in parts per
million relative to residual solvent. UV spectra were measured in water on a
Hewlett-Packard Model 8452A diode array spectrophotometer. Matrix-assisted, laser
desorption/ionization time of flight mass spectrometry (MALDI-TOF) was performed at
the Protein and Peptide Microanalytical Facility at the California Institute of
Technology. HPLC analysis was performed on either a HP 1090M analytical HPLC or a
Beckman Gold system using a RAINEN C18, Microsorb MV, 5 μm, 300 × 4.6 mm reversed
phase column in 0.1% (wt/v) TFA with acetonitrile as eluent and a flow rate of
1.0 mL/min, gradient elution 1.25% acetonitrile/min. Preparatory reverse-phase HPLC
was performed on a Beckman HPLC with a Waters DeltaPak 25 × 100 mm, 100 μm C18
column equipped with a guard, 0.1% (wt/v) TFA, 0.25% acetonitrile/min. The 18 MΩ
water was obtained from a Millipore MilliQ water purification system, and all
buffers were 0.2 μm filtered.

ImImIm-γ-PyPyPy-β-Dp (1). The product was synthesized by manual solid-phase
protocols7 and recovered as a white powder (2.4 mg, 4% recovery). UV λmax 312
(48 500); ¹H NMR (DMSO-d6) δ
Figure 4. MPE·Fe(II) footprinting and affinity cleavage experiments on a
3'-32P-labeled 274 bp EcoRI/PvuII restriction fragment from plasmid pSES1. The
5'-AGGGA(A)-3', 5'-AGGCA(A)-3', 5'-TGGGT(C)-3', 5'-TGGGC(T)-3', and 5'-AGGAA(A)-3'
sites are shown on the right side of the autoradiogram. Lane 1, intact DNA; lane 2,
A reaction; lane 3, G reaction; lanes 4 and 11, MPE·Fe(II) standard; lane 5, 10 μM
ImImImPy-γ-PyPyPyPy-β-Dp; lanes 6 and 7, 10 nM and 100 nM
ImImImPy-γ-PyPyPyPy-β-Dp-EDTA·Fe(II); lane 8, 10 μM ImImIm-γ-PyPyPy-β-Dp; lanes 9
and 10, 100 nM and 1 μM ImImIm-γ-PyPyPy-β-Dp-EDTA·Fe(II). All lanes contain 15 kcpm
3'-radiolabeled DNA. The control and MPE·Fe(II) lanes (1, 4, 5, 8, and 11) contain
25 mM Tris-acetate buffer (pH 7.0), 10 mM NaCl, and 100 μM/base pair calf thymus
DNA. The affinity cleavage lanes (6, 7, 9, and 10) contain 25 mM Tris-acetate buffer
(pH 7.0), 200 mM NaCl, and 50 μg/mL glycogen.

10.09 (s, 1 H), 9.89 (s, 1 H), 9.83 (s, 1 H), 9.83 (s, 1 H), 9.57 (s, 1 H),
9.19 (br s, 1 H), 8.36 (t, 1 H, J = 5.6 Hz), 8.03 (m, 2 H), 7.64 (s, 1 H),
7.51 (s, 1 H), 7.45 (s, 1 H), 7.20 (d, 1 H, J = 1.0 Hz), 7.15 (d, 1 H,
J = 2.0 Hz), 7.14 (s, 1 H), 7.08 (s, 1 H), 7.04 (s, 1 H), 6.37 (d, 2 H,
J = 2.2 Hz), 4.01 (s, 3 H), 3.99 (s, 3 H), 3.95 (s, 3 H), 3.82 (s, 3 H),
3.82 (s, 3 H), 3.79 (s, 3 H), 3.37 (q, 2 H, J = 5.8 Hz), 3.26 (q, 2 H,
J = 6.1 Hz), 3.10 (q, 2 H, J = 6.1 Hz), 2.99 (m, 2 H), 2.73 (d, 6 H,
J = 4.8 Hz), 2.34 (t, 2 H, J = 7.2 Hz), 2.27 (t, 2 H, J = 7.3 Hz), 1.79
(m, 4 H); MALDI-TOF-MS, 980.1 (980.1 calcd for M + H).
ImImImPy-γ-PyPyPyPy-β-Dp (2). The product was synthesized by manual solid-phase
protocols7 and recovered as a white powder (7.6 mg, 11% recovery). UV λmax 243
(42 000), 312 (48 500); ¹H NMR (DMSO-d6) δ 10.32 (s, 1 H), 10.13 (s, 1 H),
9.93 (s, 1 H), 9.90 (s, 1 H), 9.89 (s, 1 H), 9.84 (s, 1 H), 9.59 (s, 1 H),
9.23 (br s, 1 H), 8.09 (t, 1 H, J = 5.3 Hz), 8.04 (m, 2 H), 7.65 (s, 1 H),
7.57 (s, 1 H), 7.46 (d, 1 H, J = 0.6 Hz), 7.22 (m, 3 H), 7.16 (s, 2 H),
7.09 (d, 1 H, J = 0.8 Hz), 7.06 (d, 2 H, J = 1.1 Hz), 7.00 (d, 1 H, J = 1.7),
6.83 (d, 1 H, J = 1.8), 6.37 (d, 1 H, J = 1.8 Hz), 4.02 (s, 3 H), 4.00 (s, 3 H),
3.99 (s, 3 H), 3.84 (s, 3 H), 3.83 (s, 3 H), 3.83 (s, 3 H), 3.80 (s, 3 H),
3.79 (s, 3 H), 3.37 (q, 2 H, J = 6.2 Hz), 3.21 (q, 2 H, J = 6.4 Hz), 3.10 (q,
2 H, J = 6.2 Hz), 3.00 (m, 2 H), 2.73 (d, 6 H, J = 4.9 Hz), 2.34 (t, 2 H,
J = 7.2 Hz), 2.28 (t, 2 H, J = 7.0 Hz), 1.76 (m, 4 H); MALDI-TOF-MS, 1225.9
(1224.3 calcd for M + H).

ImImIm-γ-PyPyPy-β-Dp-NH2 (1-NH2). A sample of machine-synthesized resin (350 mg,
0.17 mmol/g15) was placed in a 20 mL glass scintillation vial and treated with 2 mL
of 3,3'-diamino-N-methyldipropylamine at 55 °C for 18 h. The resin was removed by
filtration through a disposable propylene filter, and the resulting solution was
dissolved in water to a total volume of 8 mL and purified directly by preparatory
reversed-phase HPLC to provide ImImIm-γ-PyPyPy-β-Dp-NH2 (28 mg, 41% recovery) as a
white powder. ¹H NMR (DMSO-d6) δ 10.14 (s, 1 H), 9.89 (s, 1 H), 9.88 (s, 1 H),
9.83 (s, 1 H), 9.6 (br s, 1 H), 9.59 (s, 1 H), 8.36 (t, 1 H, J = 5.5 Hz),
8.09 (t, 1 H, J = 5.0 Hz), 8.03 (t, 1 H, J = 5.0 Hz), 7.9 (br s, 3 H),
7.63 (s, 1 H), 7.50 (s, 1 H), 7.44 (s, 1 H), 7.19 (d, 1 H, J = 1.2 Hz),
7.13 (m, 2 H), 7.08 (d, 1 H, J = 1.3 Hz), 7.02 (d, 1 H, J = 1.2 Hz),
6.85 (m, 2 H), 4.01 (s, 3 H), 3.99 (s, 3 H), 3.97 (m, 6 H), 3.80 (s, 3 H),
3.77 (s, 3 H), 3.34 (q, 2 H, J = 5.3 Hz), 3.23 (q, 2 H, J = 6.0 Hz),
3.05 (m, 6 H), 2.83 (q, 2 H, J = 5.0 Hz), 2.70 (d, 3 H, J = 4.0 Hz),
2.32 (t, 2 H, J = 6.9 Hz), 2.25 (t, 2 H, J = ?.9 Hz), 1.90 (m, 2 H),
1.77 (m, 4 H); MALDI-TOF-MS, 1022.3 (1023.1 calcd for M + H).

ImImImPy-γ-PyPyPyPy-β-Dp-NH2 (2-NH2). A sample of machine-synthesized resin (350 mg,
0.16 mmol/g15) was placed in a 20 mL glass scintillation vial and treated with 2 mL
of 3,3'-diamino-N-methyldipropylamine at 55 °C for 18 h. The resin was removed by
filtration through a disposable propylene filter, and the resulting solution was
dissolved in water to a total volume of 8 mL and purified directly by preparatory
reversed-phase HPLC to provide ImImImPy-γ-PyPyPyPy-β-Dp-NH2 (31 mg, 40% recovery)
as a white powder. ¹H NMR (DMSO-d6) δ 10.37 (s, 1 H), 10.16 (s, 1 H),
9.95 (s, 1 H), 9.93 (s, 1 H), 9.91 (s, 1 H), 9.86 (s, 1 H), 9.49 (br s, 1 H),
9.47 (s, 1 H), 8.12 (m, 3 H), 8.0 (br s, 1 H), 7.65 (s, 1 H), 7.57 (s, 1 H),
7.46 (s, 1 H), 7.20 (m, 3 H), 7.16 (m, 2 H), 7.09 (d, 1 H, J = 1.5 Hz),
7.05 (m, 2 H), 7.00 (d, 1 H, J = 1.6 Hz), 6.88 (m, 2 H), 4.01 (s, 3 H),
3.99 (s, 3 H), 3.98 (s, 3 H), 3.83 (s, 3 H), 3.82 (s, 3 H), 3.81 (s, 3 H),
3.79 (s, 3 H), 3.73 (s, 3 H), 3.36 (q, 2 H, J = 5.3 Hz), 3.21-3.05 (m, 3 H),
2.85 (q, 2 H, J = 4.9 Hz), 2.71 (d, 3 H, J = 4.4 Hz), 2.34 (t, 2 H,
J = 5.9 Hz), 2.26 (t, 2 H, J = 5.9 Hz), 1.85 (quintet, J = 5.7 Hz),
1.72 (m, 4 H); MALDI-TOF-MS, 1267.1 (1267.4 calcd for M + H).

ImImIm-γ-PyPyPy-β-Dp-EDTA (1-E). EDTA dianhydride (50 mg) was dissolved in 1 mL of
DMSO/NMP solution and 1 mL of DIEA by heating at 55 °C for 5 min. The dianhydride
solution was added to ImImIm-γ-PyPyPy-β-Dp-NH2 (1-NH2) (8.0 mg, 7 μmol) dissolved in
750 μL of DMSO. The mixture was heated at 55 °C for 25 min, treated with 3 mL of
0.1 M NaOH, and heated at 55 °C for 10 min. TFA (0.1%) was added to adjust the total
volume to 8 mL and the solution was purified directly by preparatory HPLC
chromatography to provide 1-E as a white powder (3.3 mg, 30% recovery). ¹H NMR
(DMSO-d6) δ 10.14 (s, 1 H), 9.90 (s, 1 H), 9.89 (s, 1 H), 9.85 (s, 1 H),
9.58 (s, 1 H), 9.3 (br s, 1 H), 8.40 (m, 2 H), 8.02 (m, 1 H), 7.65 (s, 1 H),
7.51 (s, 1 H), 7.45 (s, 1 H), 7.20 (d, 1 H, J = 1.5 Hz), 7.15 (m, 2 H),
7.08 (d, 1 H, J = 1.1 Hz), 7.04 (d, 1 H, J = 1.5 Hz), 6.86 (m, 2 H),
4.00 (s, 3 H), 3.98 (s, 3 H), 3.94 (s, 3 H), 3.87 (m, 4 H), 3.82 (s, 3 H),
3.81 (s, 3 H), 3.78 (s, 3 H), 3.72 (m, 4 H), 3.4-3.0 (m, 16 H), 2.71 (d, 3 H,
J = 4.2 Hz), 2.33 (t, 2 H, J = 5.1 Hz), 2.25 (t, 2 H, J = 5.9 Hz),
1.75 (m, 6 H); MALDI-TOF-MS, 1298.4 (1298.3 calcd for M + H).

(15) Resin substitution has been corrected for the weight of the polyamide chain.
The change in substitution during a specific coupling or for the entire synthesis
can be calculated as
$L_{\mathrm{new}}(\mathrm{mmol/g}) = L_{\mathrm{old}}/\left(1 + L_{\mathrm{old}}(W_{\mathrm{new}} - W_{\mathrm{old}}) \times 10^{-3}\right)$,
where L is the loading (mmol of amine per gram of resin), and W is the weight
(g mol⁻¹) of the growing polyamide attached to the resin. See: Barlos, K.; Chatzi,
O.; Gatos, D.; Stavropoulos, G. Int. J. Peptide Protein Res. 1991, 37, 513.
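
The substitution correction in footnote 15 can be checked numerically. The sketch
below assumes the relation has the form L_new = L_old / (1 + L_old (W_new − W_old) ×
10⁻³), with L in mmol/g and W in g/mol. The function name and the 900 g/mol mass
gain in the example are hypothetical illustration values, not figures from the
paper.

```python
def corrected_loading(l_old, w_old, w_new):
    """Resin loading (mmol/g) after the attached chain grows from w_old to w_new (g/mol).

    Assumed form: L_new = L_old / (1 + L_old * (W_new - W_old) * 1e-3).
    The factor 1e-3 converts mmol to mol so that the product is dimensionless.
    """
    if l_old <= 0:
        raise ValueError("loading must be positive")
    return l_old / (1.0 + l_old * (w_new - w_old) * 1e-3)


if __name__ == "__main__":
    # Starting resin at 0.2 mmol/g; the 900 g/mol net mass gain is a made-up example.
    print(round(corrected_loading(0.2, 0.0, 900.0), 3))
```

A net gain of roughly this size lowers a 0.2 mmol/g resin to about 0.17 mmol/g,
the same order as the corrected substitutions quoted for the machine-synthesized
resins.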
J. Am. Chem. Soc., Vol. 118, No. 35, 1996 8203

[Figure 5 panels: (A) ImImIm-γ-PyPyPy-β-Dp; (B) ImImIm-γ-PyPyPy-β-Dp-EDTA·Fe(II);
(C) ImImIm-γ-PyPyPy-β-Dp-EDTA·Fe(II); (D) ImImImPy-γ-PyPyPyPy-β-Dp;
(E) ImImImPy-γ-PyPyPyPy-β-Dp-EDTA·Fe(II); (F) ImImImPy-γ-PyPyPyPy-β-Dp-EDTA·Fe(II).
Sequence shown in each panel, ending at the PvuII site:
5'-GGTGTCATAAAGGGAATACGCGGACTAGGCAAATGCCGCTGATGGGTCAGTGCTCAGATGGGCTCAGCTTGGCGT-3'
3'-CCACAGTATTTCCCTTATGCGCCTGATCCGTTTACGGCGACTACCCAGTCACGAGTCTACCCGAGTCGAACCGCA-5']

Figure 5. Results from MPE·Fe(II) footprinting of ImImIm-γ-PyPyPy-β-Dp and
ImImImPy-γ-PyPyPyPy-β-Dp and affinity cleavage of ImImIm-γ-PyPyPy-β-Dp-EDTA·Fe(II)
and ImImImPy-γ-PyPyPyPy-β-Dp-EDTA·Fe(II). (Top) Illustration of the 274 bp
restriction fragment with the position of the sequence indicated. Boxes represent
equilibrium binding sites determined by the published model. Only sites that were
quantitated by DNase I footprint titrations are boxed. (A and D): MPE·Fe(II)
protection patterns for polyamides at 10 μM concentration. Bar heights are
proportional to the relative protection from cleavage at each band. (B and E):
Affinity cleavage patterns of ImImIm-γ-PyPyPy-β-Dp-EDTA·Fe(II) at 1 μM and of
ImImImPy-γ-PyPyPyPy-β-Dp-EDTA·Fe(II) at 100 nM, respectively. Arrow heights are
proportional to the relative cleavage intensities at each base pair. (C and F): Ball
and stick binding models for the single orientation binding to formal match
sequences by the six-ring and eight-ring EDTA·Fe(II) analogs, respectively. Shaded
and nonshaded circles denote imidazole and pyrrole carboxamides, respectively.
Nonshaded diamonds represent the β-alanine residue. The boxed Fe denotes the
EDTA·Fe(II) cleavage moiety.

ImImImPy-γ-PyPyPyPy-β-Dp-EDTA (2-E). Compound 2-E was prepared as described for
compound 1-E (yield 3.8 mg, 40%). ¹H NMR (DMSO-d6) δ 10.34 (s, 1 H), 10.11 (s, 1 H),
9.92 (s, 1 H), 9.90 (s, 1 H), 9.89 (s, 1 H), 9.84 (s, 1 H), 9.57 (s, 1 H),
8.42 (m, 1 H), 8.03 (m, 3 H), 7.64 (s, 1 H), 7.56 (s, 1 H), 7.44 (s, 1 H),
7.20 (m, 3 H), 7.15 (m, 2 H), 7.07 (d, 1 H, J = 1.6 Hz), 7.05 (m, 2 H),
6.99 (d, 1 H, J = 1.6
Swalley et al.

Table 1. Equilibrium Association Constants (M⁻¹)a-c

                            match            match            end mismatch     core mismatch
polyamide                   5'-AGGGA-3'      5'-TGGGT-3'      5'-TGGGC-3'      5'-AGGCA-3'
ImImIm-γ-PyPyPy-β-Dp        4.6 × 10⁶ (0.3)  7.6 × 10⁶ (0.5)  1.3 × 10⁶ (0.3)  8.6 × 10⁵ (0.4)

                            match site       end mismatch     core mismatch    core mismatch
polyamide                   5'-AGGGAA-3'     5'-TGGGTC-3'     5'-TGGGCT-3'     5'-AGGCAA-3'
ImImImPy-γ-PyPyPyPy-β-Dp    3.7 × 10⁸ (0.3)  1.4 × 10⁷ (0.5)  1.7 × 10⁶ (0.3)  2.9 × 10⁶ (0.3)

a Values reported are the mean values measured from a minimum of three DNase I
footprint titration experiments, with the standard deviation for each data set
indicated in parentheses. b The assays were performed at 22 °C at pH 7.0 in the
presence of 10 mM Tris-HCl, 10 mM KCl, 10 mM MgCl2, and 5 mM CaCl2. c Base pairs
that are in bold represent formal mismatches.




Figure 6. Quantitative DNase I footprint titration experiment with
ImImImPy-γ-PyPyPyPy-β-Dp on the EcoRI/PvuII restriction fragment from plasmid pSES1:
lane 1, intact DNA; lane 2, A reaction; lane 3, G reaction; lane 4, DNase I
standard; lanes 5-20, 20 pM, 50 pM, 100 pM, 200 pM, 500 pM, 1 nM, 2 nM, 5 nM, 10 nM,
20 nM, 50 nM, 100 nM, 200 nM, 500 nM, 1 μM, 2 μM ImImImPy-γ-PyPyPyPy-β-Dp. The
5'-AGGGAA-3', 5'-AGGCAA-3', 5'-TGGGTC-3', and 5'-TGGGCT-3' sites that were analyzed
are shown on the right side of the autoradiogram. All reactions contain 20 kcpm
restriction fragment, 10 mM Tris-HCl (pH 7.0), 10 mM KCl, 10 mM MgCl2, and 5 mM
CaCl2.

Hz), 6.87 (m, 2 H), 4.00 (s, 3 H), 3.98 (s, 3 H), 3.97 (s, 3 H), 3.83 (m, 4 H),
3.82 (s, 6 H), 3.79 (s, 3 H), 3.73 (s, 6 H), 3.67 (m, 4 H), 3.4-3.0 (m, 16 H),
2.71 (d, 3 H, J = 4.2 Hz), 2.34 (t, 2 H, J = 5.4 Hz), 2.25 (t, 2 H,
J = 5.9 Hz), 1.72 (m, 6 H); MALDI-TOF-MS, 1542.2 (1542.6 calcd for M + H).

DNA Reagents and Materials. Enzymes were purchased from Boehringer-Mannheim or New
England Biolabs and were used with their supplied buffers. Deoxyadenosine and
thymidine 5'-[α-32P]triphosphates and deoxyadenosine 5'-[γ-32P]triphosphate were
obtained from Amersham. Purified water was obtained by filtering doubly-distilled
water through the MilliQ filtration system from Millipore. Sonicated, deproteinized
calf thymus DNA was acquired from Pharmacia. All other reagents and materials were
used as received. All DNA manipulations were performed according to standard
protocols.16




[Figure 7 plots; x-axes: [ImImIm-γ-PyPyPy-β-Dp] (M), top, and
[ImImImPy-γ-PyPyPyPy-β-Dp] (M), bottom]

Figure 7. Data from the quantitative DNase I footprint titration experiments for the
two polyamides, ImImIm-γ-PyPyPy-β-Dp (top) and ImImImPy-γ-PyPyPyPy-β-Dp (bottom), in
complex with the designated sites. The θ_norm points were obtained using
photostimulable storage phosphor autoradiography and processed as described in the
Experimental Section. The data points for 5'-AGGGA(A)-3', 5'-TGGGT(C)-3',
5'-AGGCA(A)-3', and 5'-TGGGC(T)-3' sites are indicated by filled circles (●), open
squares (□), filled inverted triangles (▼), and open circles (○), respectively. The
solid curves are the best-fit Langmuir binding titration isotherms obtained from the
nonlinear least-squares algorithm using eq 2.

Construction of Plasmid DNA. The plasmid pSES1 was constructed by hybridization of
the inserts,
5'-GATCCGGTGTCATAAAGGGAATACGCGGACTAGGCAAATGCCGCTGATGGGTCAGTGCTCAGATGGGCTC-3'
and 5'-AGCTGAGC-



(16) Sambrook, J.; Fritsch, E. F.; Maniatis, T. Molecular Cloning; Cold Spring
Harbor Laboratory: Cold Spring Harbor, NY, 1989.



Recognition of Core 5'-GGG-3' Sequences

[Figure 8 diagram: ball and stick models for each binding site. Six-ring polyamide
(left): match 5'-AGGGA-3', 5 × 10⁶ M⁻¹; match 5'-TGGGT-3', 8 × 10⁶ M⁻¹; end mismatch
5'-TGGGC-3'; core mismatch 5'-AGGCA-3', 9 × 10⁵ M⁻¹. Eight-ring polyamide (right):
match 5'-AGGGAA-3', 4 × 10⁸ M⁻¹; end mismatch 5'-TGGGTC-3', 1 × 10⁷ M⁻¹; core
mismatch 5'-TGGGCT-3', 2 × 10⁶ M⁻¹; core mismatch 5'-AGGCAA-3', 3 × 10⁶ M⁻¹]



Figure 8. Ball and stick models of ImImIm-γ-PyPyPy-β-Dp (left) and
ImImImPy-γ-PyPyPyPy-β-Dp (right) for each binding site, with the corresponding
equilibrium association constants shown below each individual model. The binding
sites shown are (a) 5'-AGGGA(A)-3', (b) 5'-TGGGT(C)-3', (c) 5'-TGGGC(T)-3', and
(d) 5'-AGGCA(A)-3'. Shaded and nonshaded circles denote imidazole and pyrrole
carboxamides, respectively. Nonshaded diamonds represent the β-alanine residue.
Formally mismatched base pairs are boxed.

CCATCTGAGCACTGACCCATCAGCGGCATTTGCCTAGT- 
CCGCGTATTCCCTTTATGACACCG-3'. The hybridized insert was ligated into linearized pUC19
BamHI/HindIII plasmid using T4 DNA ligase. The resultant constructs were used to
transform Epicurian Coli XL-2 Blue competent cells from Stratagene.
Ampicillin-resistant white colonies were selected from 25 mL of Luria-Bertani medium
agar plates containing 50 μg/mL ampicillin and treated with XGAL and IPTG solutions.
Large-scale plasmid purification was performed with Qiagen Maxi purification kits.
Dideoxy sequencing was used to verify the presence of the desired insert.
Concentration of the prepared plasmid was determined at 260 nm using the
relationship of 1 OD unit = 50 μg/mL of duplex DNA.
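
The concentration rule stated above (1 OD unit at 260 nm = 50 μg/mL of duplex DNA)
is a single multiplication. A minimal Python sketch follows; the function name and
the optional dilution-factor argument are illustrative additions, not part of the
published procedure.

```python
UG_PER_ML_PER_OD260 = 50.0  # duplex DNA, as stated in the text


def duplex_dna_conc_ug_per_ml(a260, dilution_factor=1.0):
    """Concentration of duplex DNA (ug/mL) from an absorbance reading at 260 nm.

    `dilution_factor` is how many-fold the sample was diluted before reading
    (1.0 means it was read undiluted).
    """
    if a260 < 0 or dilution_factor <= 0:
        raise ValueError("absorbance must be >= 0 and dilution factor > 0")
    return a260 * UG_PER_ML_PER_OD260 * dilution_factor


if __name__ == "__main__":
    # Example readings are made up for illustration.
    print(duplex_dna_conc_ug_per_ml(0.5))                        # undiluted sample
    print(duplex_dna_conc_ug_per_ml(0.1, dilution_factor=100))   # 100-fold diluted
```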

Preparation of 3'- and 5'-End-Labeled Restriction Fragments. The plasmid pSES1 was
linearized with EcoRI and then treated with either Klenow fragment, deoxyadenosine
5'-[α-32P]triphosphate and thymidine 5'-[α-32P]triphosphate for 3' labeling, or calf
alkaline phosphatase and then 5' labeled with T4 polynucleotide kinase and
deoxyadenosine 5'-[γ-32P]triphosphate. The labeled fragment (3' or 5') was then
digested with PvuII and loaded onto a 5% non-denaturing polyacrylamide gel. The
desired 274 base pair band was visualized by autoradiography and isolated. Chemical
sequencing reactions were performed according to published methods.17

(17) (a) Iverson, B. L.; Dervan, P. B. Nucleic Acids Res. 1987, 15, 7823.
(b) Maxam, A. M.; Gilbert, W. S. Methods Enzymol. 1980, 65, 499.

MPE·Fe(II) Footprinting. All reactions were carried out in a volume of 40 μL. A
polyamide stock solution or water (for reference lanes) was added to an assay buffer
where the final concentrations were 25 mM Tris-acetate buffer (pH 7.0), 10 mM NaCl,
100 μM/base pair calf thymus DNA, and 15 kcpm 3'- or 5'-radiolabeled DNA. The
solutions were allowed to equilibrate for 4 h. A fresh 50 μM MPE·Fe(II) solution was
made from 100 μL of a 100 μM MPE solution and 100 μL of a 100 μM ferrous ammonium
sulfate (Fe(NH4)2(SO4)2·6H2O) solution. After the 4 h equilibration, MPE·Fe(II)
solution (5 μM) was added, and the reactions were equilibrated for 5 min. Cleavage
was initiated by the addition of dithiothreitol (5 mM) and allowed to proceed for
14 min. Reactions were stopped by ethanol precipitation, resuspended in 100 mM
tris-borate-EDTA/80% formamide loading buffer, denatured at 85 °C for 5 min, placed
on ice, and immediately loaded onto an 8% denaturing polyacrylamide gel (5%
cross-link, 7 M urea) at 2000 V.
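
The stock preparation in the footprinting protocol is ordinary dilution arithmetic:
mixing equal volumes of two 100 μM solutions halves each concentration, giving the
50 μM MPE·Fe(II) stock. The sketch below works through that step and, assuming the
quoted 5 μM is the final concentration in the 40 μL reaction (the text does not
state this explicitly), the stock volume that would be needed. Function names are
hypothetical.

```python
def concentration_after_mixing(volume_ul, conc_um, added_volume_ul):
    """Concentration (uM) of a solute after adding solute-free volume to it."""
    return volume_ul * conc_um / (volume_ul + added_volume_ul)


def stock_volume_needed(stock_um, final_um, final_volume_ul):
    """Volume of stock (uL) giving `final_um` in `final_volume_ul` (C1*V1 = C2*V2)."""
    if final_um > stock_um:
        raise ValueError("final concentration cannot exceed stock concentration")
    return final_um * final_volume_ul / stock_um


if __name__ == "__main__":
    # 100 uL of 100 uM MPE mixed with 100 uL of Fe(II) solution (no MPE in it).
    mpe_fe_stock = concentration_after_mixing(100.0, 100.0, 100.0)
    print(mpe_fe_stock)
    # Assumption: 5 uM is the final concentration in a 40 uL reaction.
    print(stock_volume_needed(mpe_fe_stock, 5.0, 40.0))
```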

Affinity Cleaving. All reactions were carried out in a volume of 400 μL. A polyamide
stock solution or water (for reference lanes) was added to an assay buffer where the
final concentrations were 25 mM Tris-acetate buffer (pH 7.0), 200 mM NaCl, 50 μg/mL
of glycogen, and 15 kcpm 3'- or 5'-radiolabeled DNA. After the reactions were
allowed to equilibrate for 4 h, ferrous ammonium sulfate (Fe(NH4)2(SO4)2·6H2O),
10 μM final concentration, was added. After another 15 min, cleavage was initiated
by the addition of dithiothreitol (5 mM) and allowed to proceed for 12 min.
Reactions were stopped by ethanol precipitation, resuspended in 100 mM
tris-borate-EDTA/80% formamide loading buffer, denatured at 85 °C for 5 min, placed
on ice, and immediately loaded onto an 8% denaturing polyacrylamide gel (5%
cross-link, 7 M urea) at 2000 V.

DNase I Footprinting. All reactions were carried out in a volume of 40 μL. We note
explicitly that no carrier DNA was used in these reactions. A polyamide stock
solution or water (for reference lanes) was added to an assay buffer where the final
concentrations were 10 mM Tris-HCl buffer (pH 7.0), 10 mM KCl, 10 mM MgCl2, 5 mM
CaCl2, and 20 kcpm 3'-radiolabeled DNA. The solutions were allowed to equilibrate
for a minimum of 4 h at 22 °C (the four-ring hairpin was allowed to equilibrate for
up to 12 h with no noticeable effect on the data set). Cleavage was initiated by the
addition of 4 μL of a DNase I stock solution (diluted with 1 mM DTT to give a stock
concentration of 0.225 u/mL) and was allowed to proceed for 5 min at 22 °C. The
reactions were stopped by the addition of 3 M sodium acetate solution containing
50 mM EDTA and then ethanol precipitated. The cleavage products were resuspended in
100 mM tris-borate-EDTA/80% formamide loading buffer, denatured at 85 °C for 5 min,
placed on ice, and immediately loaded onto an 8% denaturing polyacrylamide gel (5%
cross-link, 7 M urea) at 2000 V for 1 h. The gels were dried under vacuum at 80 °C,
then quantitated using storage phosphor technology.

Equilibrium association constants were determined as previously described.11 The
data were analyzed by performing volume integrations of the 5'-AGGGA(A)-3',
5'-TGGGT(C)-3', 5'-TGGGC(T)-3', and 5'-AGGCA(A)-3' sites and a reference site. The
apparent DNA target site saturation, $\theta_{\mathrm{app}}$, was calculated for
each concentration of polyamide using the following equation:

$$\theta_{\mathrm{app}} = 1 - \frac{I_{\mathrm{tot}}/I_{\mathrm{ref}}}{I_{\mathrm{tot}}^{\circ}/I_{\mathrm{ref}}^{\circ}} \qquad (1)$$

where $I_{\mathrm{tot}}$ and $I_{\mathrm{ref}}$ are the integrated volumes of the
target and reference sites, respectively, and $I_{\mathrm{tot}}^{\circ}$ and
$I_{\mathrm{ref}}^{\circ}$ correspond to those values for a DNase I control lane to
which no polyamide has been added. The
($[\mathrm{L}]_{\mathrm{tot}}$, $\theta_{\mathrm{app}}$) data points were fit to a
Langmuir binding isotherm (eq 2, n = 1) by minimizing the difference between
$\theta_{\mathrm{app}}$ and $\theta_{\mathrm{fit}}$, using the modified Hill
equation:

$$\theta_{\mathrm{fit}} = \theta_{\min} + (\theta_{\max} - \theta_{\min})\,\frac{K_a^{\,n}[\mathrm{L}]_{\mathrm{tot}}^{\,n}}{1 + K_a^{\,n}[\mathrm{L}]_{\mathrm{tot}}^{\,n}} \qquad (2)$$

where $[\mathrm{L}]_{\mathrm{tot}}$ corresponds to the total polyamide
concentration, $K_a$ corresponds to the apparent monomeric association constant, and
$\theta_{\min}$ and $\theta_{\max}$ represent the experimentally determined site
saturation values when the site is unoccupied or saturated, respectively. Data were
fit using a nonlinear least-squares fitting procedure of KaleidaGraph software
(version 2.1, Abelbeck software) with $K_a$, $\theta_{\max}$, and $\theta_{\min}$ as
the adjustable parameters. All acceptable fits had a correlation coefficient of
R > 0.97. At least three sets of acceptable data were used in determining each
association constant. All lanes from each gel were used unless visual inspection
revealed a data point to be obviously flawed relative to neighboring points. The
data were normalized using the following equation:

$$\theta_{\mathrm{norm}} = \frac{\theta_{\mathrm{app}} - \theta_{\min}}{\theta_{\max} - \theta_{\min}} \qquad (3)$$
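
The three equations above (apparent saturation, the modified Hill isotherm, and the
normalization) can be sketched in a few lines of Python. The authors fit their data
with KaleidaGraph; the fitting routine below is not their procedure but a simple
stand-in, a grid search over log10(Ka) in which the minimum and maximum saturation
values are solved by linear least squares at each trial Ka (for n = 1 the model is
linear in those two parameters). All names are illustrative, and the titration in
the example is synthetic, generated from assumed parameters rather than measured
data.

```python
import math


def theta_app(i_tot, i_ref, i_tot0, i_ref0):
    """Eq 1: apparent site saturation from integrated band volumes."""
    return 1.0 - (i_tot / i_ref) / (i_tot0 / i_ref0)


def theta_fit(l_tot, ka, t_min, t_max, n=1):
    """Eq 2: modified Hill equation (a Langmuir isotherm when n = 1)."""
    x = (ka * l_tot) ** n
    return t_min + (t_max - t_min) * x / (1.0 + x)


def theta_norm(t_app, t_min, t_max):
    """Eq 3: rescale saturation so unoccupied = 0 and saturated = 1."""
    return (t_app - t_min) / (t_max - t_min)


def fit_isotherm(conc, theta, log_ka_lo=4.0, log_ka_hi=11.0, step=0.01):
    """Return (Ka, theta_min, theta_max) minimizing the squared error for n = 1."""
    best = None
    for i in range(int(round((log_ka_hi - log_ka_lo) / step)) + 1):
        ka = 10.0 ** (log_ka_lo + i * step)
        f = [ka * c / (1.0 + ka * c) for c in conc]
        # Model: theta = t_min * (1 - f) + t_max * f; solve the 2x2 normal equations.
        a11 = sum((1.0 - fi) ** 2 for fi in f)
        a12 = sum((1.0 - fi) * fi for fi in f)
        a22 = sum(fi * fi for fi in f)
        b1 = sum((1.0 - fi) * t for fi, t in zip(f, theta))
        b2 = sum(fi * t for fi, t in zip(f, theta))
        det = a11 * a22 - a12 * a12
        if abs(det) < 1e-12:
            continue  # t_min and t_max are not separately determined at this Ka
        t_min = (b1 * a22 - b2 * a12) / det
        t_max = (a11 * b2 - a12 * b1) / det
        sse = sum((t - theta_fit(c, ka, t_min, t_max)) ** 2
                  for c, t in zip(conc, theta))
        if best is None or sse < best[0]:
            best = (sse, ka, t_min, t_max)
    if best is None:
        raise ValueError("no usable grid point; check the input data")
    return best[1], best[2], best[3]


if __name__ == "__main__":
    # Concentration series (M) matching the 16 titration lanes listed for Figure 6.
    conc = [20e-12, 50e-12, 100e-12, 200e-12, 500e-12, 1e-9, 2e-9, 5e-9,
            10e-9, 20e-9, 50e-9, 100e-9, 200e-9, 500e-9, 1e-6, 2e-6]
    # Synthetic, noise-free data from assumed parameters (not measured values).
    data = [theta_fit(c, 3.7e8, 0.02, 0.95) for c in conc]
    ka, t_min, t_max = fit_isotherm(conc, data)
    print(f"Ka = {ka:.2e} M^-1, theta_min = {t_min:.3f}, theta_max = {t_max:.3f}")
```

A gradient-based nonlinear least-squares routine would be the usual choice in
practice; the grid search is used here only because it needs nothing beyond the
standard library and makes the role of each adjustable parameter explicit.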

Quantitation by Storage Phosphor Technology Autoradiography. Photostimulable storage
phosphorimaging plates (Kodak Storage Phosphor Screen S0230 obtained from Molecular
Dynamics) were pressed flat against gel samples and exposed in the dark at 22 °C for
12-16 h. A Molecular Dynamics 400S PhosphorImager was used to obtain all data from
the storage screens. The data were analyzed by performing volume integrations of all
bands using the ImageQuant v. 3.2.
Abstract: A series of four hairpin pyrrole-imidazole polyamides,
ImImPy-γ-PyPyPy-β-Dp, PyPyPy-γ-ImImPy-β-Dp, AcImImPy-γ-PyPyPy-β-Dp, and
AcPyPyPy-γ-ImImPy-β-Dp (Im = N-methylimidazole-2-carboxamide, Py =
N-methylpyrrole-2-carboxamide, Dp = N,N-dimethylaminopropylamide, γ = γ-aminobutyric
acid, β = β-alanine, and Ac = acetyl), designed for recognition of
5'-(A,T)GG(A,T)2-3' sequences in the minor groove of DNA were synthesized using
solid phase methodology and analyzed with respect to DNA binding affinity and
sequence specificity. Quantitative DNase I footprint titration experiments reveal
that the optimal polyamide ImImPy-γ-PyPyPy-β-Dp binds a designated 5'-TGGTT-3' match
site with an equilibrium association constant of Ka = 1.0 × 10⁸ M⁻¹ and the single
base pair mismatch sites, 5'-TGTTA-3' and 5'-GGGTA-3', with 50-fold and 100-fold
lower affinity, respectively (10 mM Tris-HCl, 10 mM KCl, 10 mM MgCl2, and 5 mM
CaCl2, pH 7.0 and 22 °C). Polyamides of sequence composition AcImImPy-γ-PyPyPy-β-Dp
and AcPyPyPy-γ-ImImPy-β-Dp, which differ only by the position of the γ-linker, bind
with similar affinities and specificities. Recognition of sequences containing
contiguous G·C base pairs expands the sequence repertoire available for targeting
DNA with pyrrole-imidazole polyamides.



Introduction 

Pyrrole-imidazole polyamides offer a general method for
the design of non-natural molecules for sequence-specific
recognition in the minor groove of DNA.1-3 Within the 2:1
polyamide-DNA model, an imidazole (Im) on one ligand
opposite a pyrrolecarboxamide (Py) on the second ligand
recognizes a G·C base pair, while a pyrrolecarboxamide/imidazole
combination targets a C·G base pair.1-3 A pyrrolecarboxamide/
pyrrolecarboxamide pair is partially degenerate for
A·T or T·A base pairs.1-3 On the basis of this model, the
recognition of the sequences 5'-(A,T)G(A,T)C(A,T)-3',1
5'-(A,T)G(A,T)3-3', 5'-(A,T)2G(A,T)2-3',4 and 5'-(A,T)GCGC(A,T)-3'5
has been achieved. However, sequences containing contiguous
G·C base pairs are notably absent from this list.
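
The pairing rules stated in the paragraph above can be sketched as a small lookup table. This is only an illustration under stated assumptions, not a method taken from the paper: the names `PAIRING_RULES`, `core_recognition_pattern`, and `site_matches` are hypothetical, W is the IUPAC code for A or T, and only ring-over-ring pairs are modeled (turn and tail positions are ignored).

```python
# Hypothetical helper names. The rules are the three pairings stated in the
# text: Im/Py reads G·C, Py/Im reads C·G, and Py/Py reads A·T or T·A.
PAIRING_RULES = {
    ("Im", "Py"): "G",
    ("Py", "Im"): "C",
    ("Py", "Py"): "W",  # IUPAC code W = A or T
}


def core_recognition_pattern(pairs):
    """Map ring pairs (first-ligand ring, second-ligand ring) to a base pattern."""
    pattern = []
    for pair in pairs:
        if pair not in PAIRING_RULES:
            raise ValueError(f"no pairing rule for {pair!r}")
        pattern.append(PAIRING_RULES[pair])
    return "".join(pattern)


def site_matches(pattern, site):
    """Return True if a DNA site (written 5' to 3') satisfies the pattern."""
    if len(pattern) != len(site):
        return False
    allowed = {"G": "G", "C": "C", "W": "AT"}
    return all(base in allowed[code] for code, base in zip(pattern, site))


# Two adjacent Im/Py pairs followed by a Py/Py pair, as in ImImPy over PyPyPy.
core = core_recognition_pattern([("Im", "Py"), ("Im", "Py"), ("Py", "Py")])
print(core)  # prints GGW
```

Under these assumptions the GGW core corresponds to the central GGT of the 5'-TGGTT-3' match site.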

Formation of a hairpin polyamide by covalently linking a
polyamide heterodimer with a γ-aminobutyric acid (γ) residue
provides an approximate 300-fold enhancement in affinity over
the unlinked polyamides, ImPyPy- and PyPyPy-.6 Moreover,
the specificity of the hairpin is greatly improved. The initial
placement of the γ-amino acid turn was chosen for synthetic
ease and was not varied. With the development of solid phase

* Abstract published in Advance ACS Abstracts, June 15, 1996.

(1) (a) Wade, W. S.; Dervan, P. B. J. Am. Chem. Soc. 1987, 109, 1574-1575.
(b) Wade, W. S.; Mrksich, M.; Dervan, P. B. J. Am. Chem. Soc. 1992, 114, 8783.
(c) Mrksich, M.; Wade, W. S.; Dwyer, T. J.; Geierstanger, B. H.; Wemmer, D. E.;
Dervan, P. B. Proc. Natl. Acad. Sci. U.S.A. 1992, 89, 7586.
(d) Wade, W. S.; Mrksich, M.; Dervan, P. B. Biochemistry 1993, 32, 11385.

(2) (a) Pelton, J. G.; Wemmer, D. E. Proc. Natl. Acad. Sci. U.S.A. 1989, 86, 5723.
(b) Pelton, J. G.; Wemmer, D. E. J. Am. Chem. Soc. 1990, 112, 1393.
(c) Chen, X.; Ramakrishnan, B.; Rao, S. T.; Sundaralingam, M. Nature Struct. Biol. 1994, 1, 169.

(3) (a) Mrksich, M.; Dervan, P. B. J. Am. Chem. Soc. 1993, 115, 2572.
(b) Geierstanger, B. H.; Jacobsen, J.-P.; Mrksich, M.; Dervan, P. B.;
Wemmer, D. E. Biochemistry 1994, 33, 3055.

(4) Geierstanger, B. H.; Dwyer, T. J.; Bathini, Y.; Lown, J. W.; Wemmer,
methodology for polyamide synthesis, we now assess the effect
of varying the position of the γ-turn monomer.7

In order to explore the recognition of 5'-(A,T)GG(A,T)2-3'
sequences, a series of four head-to-tail linked hairpin polyamides
containing neighboring imidazole rings, ImImPy-γ-PyPyPy-β-Dp (1),
PyPyPy-γ-ImImPy-β-Dp (2), AcImImPy-γ-PyPyPy-β-Dp (3),
and AcPyPyPy-γ-ImImPy-β-Dp (4), were prepared
using solid phase methods (Figures 1 and 2).7 The polyamides
are all synthesized with Boc-β-alanine-Pam-resin, previously
shown as optimal for polyamides.8 Each imidazole is expected
to form a specific hydrogen bond with a guanine amino group
allowing the recognition of contiguous G·C base pairs (Figure
1). In addition, the linker turn position is varied within the
nonacetylated and acetylated pairs of polyamides to determine
the effect on the sequence specificity and binding affinity. We
report here the binding specificity and affinity of the polyamides
as determined by the complementary techniques, MPE·Fe(II)
footprinting9 and quantitative DNase I footprinting.10
MPE·Fe(II) footprinting verifies that sequence-specific recognition of
the expected 5'-TGGTT-3' target site has been achieved. In
addition, DNase I quantitative footprint titration experiments
reveal that the position of the γ-linker does not dramatically
affect either affinity or specificity of polyamides, especially the
pair containing acetylated N-termini.

Results 

Synthesis of Polyamides. The polyamides ImImPy-γ-PyPyPy-β-Dp (1),
PyPyPy-γ-ImImPy-β-Dp (2), AcImImPy-γ-PyPyPy-β-Dp (3),
and AcPyPyPy-γ-ImImPy-β-Dp (4) were

(7) Baird, E. E.; Dervan, P. B. J. Am. Chem. Soc. 1996, 118, 6141-6146.

(8) Parks, M. E.; Baird, E. E.; Dervan, P. B. J. Am. Chem. Soc. 1996,
118, 6147-6152.

(9) (a) Van Dyke, M. W.; Dervan, P. B. Biochemistry 1983, 22, 2373.
(b) Van Dyke, M. W.; Dervan, P. B. Nucleic Acids Res. 1983, 11, 5555.

(10) (a) Brenowitz, M.; Senear, D. F.; Shea, M. A.; Ackers, G. K.
Methods Enzymol. 1986, 130, 132. (b) Brenowitz, M.; Senear, D. F.; Shea,
M. A.; Ackers, G. K. Proc. Natl. Acad. Sci. U.S.A. 1986, 83, 8462. (c)
Senear, D. F.; Brenowitz, M.; Shea, M. A.; Ackers, G. K. Biochemistry
1986, 25, 7344.
Figure 1. Binding model for the complexes formed between polyamides
ImImPy-γ-PyPyPy-β-Dp (1) (A) and PyPyPy-γ-ImImPy-β-Dp
(2) (B) and a 5'-TGGTT-3' sequence (paired with 3'-ACCAA-5'). Circles with dots represent lone
pairs of N3 of purines and O2 of pyrimidines. Circles containing an H
represent the N2 hydrogen of guanine. Putative hydrogen bonds are
illustrated by dotted lines. Ball and stick models are also shown. Shaded
and nonshaded circles denote imidazole and pyrrole carboxamides,
respectively. Nonshaded diamonds represent a β-alanine residue;




ImImPy-γ-PyPyPy-β-Dp (1)

PyPyPy-γ-ImImPy-β-Dp (2)

AcPyPyPy-γ-ImImPy-β-Dp (4)

Figure 2. Series of polyamides synthesized using solid phase
methodology.7

prepared by solid phase methodology (Figure 2). Four unique
pyrrole and imidazole building blocks were combined in a
stepwise manner on a solid support using Boc-chemistry
protocols (Figure 3). For example, polyamide 2,
PyPyPy-γ-ImImPy-β-Dp, was prepared in 14 steps on the resin, and then
cleaved with a single-step aminolysis reaction (Figure 3). All
polyamides were found to be soluble up to at least 1 mM
concentration in aqueous solution.

Footprinting. MPE·Fe(II) footprinting on a 3'- or 5'-32P-end-labeled
266 base pair EcoRI/PvuII restriction fragment from
plasmid pMEPGG (25 mM Tris-acetate, 100 μM bp calf thymus
DNA, 10 mM NaCl) reveals that the synthetic polyamides 1-4,
at 10 μM concentration, bind the designated target site 5'-
Figure 3. Solid phase synthetic scheme for PyPyPy-γ-ImImPy-β-Dp starting from commercially available Boc-β-Pam-resin: (i) 80% TFA/DCM,
0.4 M PhSH; (ii) Boc-Py-OBt, DIEA, DMF; (iii) 80% TFA/DCM, 0.4 M PhSH; (iv) Boc-Im-OBt (DCC/HOBt), DIEA, DMF; (v) 80% TFA/DCM,
0.4 M PhSH; (vi) Boc-Im-OBt (DCC/HOBt), DIEA, DMF; (vii) 80% TFA/DCM, 0.4 M PhSH; (viii) Boc-γ-aminobutyric acid (HBTU, DIEA); (ix)
80% TFA/DCM, 0.4 M PhSH; (x) Boc-Py-OBt, DIEA, DMF; (xi) 80% TFA/DCM, 0.4 M PhSH; (xii) Boc-Py-OBt, DIEA, DMF; (xiii) 80%
TFA/DCM, 0.4 M PhSH; (xiv) pyrrole-2-carboxylic acid (HBTU/DIEA); (xv) (N,N-dimethylamino)propylamine, 55 °C. (Inset) Pyrrole and imidazole
monomers for synthesis of all compounds described here: pyrrole-2-carboxylic acid 5, imidazole-2-carboxylic acid 6, Boc-pyrrole-OBt ester
7,7 and Boc-imidazole acid 8.7
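
The alternating deprotection and coupling pattern in the Figure 3 caption can be sketched as a short expansion of the monomer order. This is a bookkeeping illustration only: `on_resin_steps` and `MONOMERS_FOR_2` are hypothetical names, and the reagent strings are copied from the caption rather than from any software protocol.

```python
# Hypothetical names; reagent strings follow the Figure 3 caption.
DEPROTECT = "deprotect: 80% TFA/DCM, 0.4 M PhSH"

# Monomers in the order they are coupled, starting from the resin-bound β-alanine.
MONOMERS_FOR_2 = [
    "Boc-Py-OBt",
    "Boc-Im-OBt (DCC/HOBt)",
    "Boc-Im-OBt (DCC/HOBt)",
    "Boc-γ-aminobutyric acid (HBTU, DIEA)",
    "Boc-Py-OBt",
    "Boc-Py-OBt",
    "pyrrole-2-carboxylic acid (HBTU/DIEA)",
]


def on_resin_steps(monomers):
    """Expand a coupling order into alternating deprotect/couple steps."""
    steps = []
    for monomer in monomers:
        steps.append(DEPROTECT)
        steps.append(f"couple: {monomer}")
    return steps


steps = on_resin_steps(MONOMERS_FOR_2)
for number, step in enumerate(steps, start=1):
    print(number, step)
```

Seven couplings, each preceded by a deprotection, give the 14 on-resin steps mentioned in the text; the aminolysis cleavage is a separate final step.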



TGGTT-3' (Figures 4 and 5). In addition, several single base
pair mismatch sites are bound with lower affinity. Quantitative
DNase I footprint titration experiments (10 mM Tris·HCl, 10
mM KCl, 10 mM MgCl₂, and 5 mM CaCl₂, pH 7.0 and 22 °C)
were performed to determine the equilibrium association
constants of the four polyamides 1-4 for a designated match
site, 5'-TGGTT-3', as well as for two single base pair mismatch
sites, 5'-TGTTA-3' and 5'-GGGTA-3' (Table 1).10 The polyamide
ImImPy-γ-PyPyPy-β-Dp binds the target site 5'-TGGTT-3'
with the highest affinity (association constant = 1.0 ×
10⁸ M⁻¹) (Figures 6 and 7). The remaining polyamides have
lower but approximately equal association constants of Ka =
~2 × 10⁷ M⁻¹ for the target site. The nonacetylated polyamides
in this series are >50-fold specific for the 5'-TGGTT-3' match
site over either of the single base pair mismatch sites analyzed.
The acetylated pair of polyamides exhibit lower sequence
specificity for the analyzed sites.
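
The fold-specificity figures quoted above are ratios of association constants. A minimal sketch follows, assuming the values reported for polyamide 1 (Ka = 1.0 × 10⁸ M⁻¹ at the match site and 1.7 × 10⁶ M⁻¹ at 5'-TGTTA-3'); `fold_specificity` is a hypothetical helper name, not something defined in the paper.

```python
def fold_specificity(ka_match, ka_mismatch):
    """Ratio of match-site to mismatch-site association constants (both in M^-1)."""
    if ka_match <= 0 or ka_mismatch <= 0:
        raise ValueError("association constants must be positive")
    return ka_match / ka_mismatch


# Polyamide 1: 5'-TGGTT-3' match versus the 5'-TGTTA-3' single mismatch.
ratio = fold_specificity(1.0e8, 1.7e6)
print(f"{ratio:.0f}-fold")  # prints 59-fold
```

When a mismatch constant is reported only as an upper limit, the same ratio gives a lower bound on the specificity.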

Discussion 

Each polyamide within this series specifically binds the five
base pair designated target sequence 5'-TGGTT-3', as shown
by MPE·Fe(II) footprinting experiments, providing the first
example of contiguous G·C recognition in the polyamide-DNA
motif. Interestingly, the polyamides prefer different mismatch
sequences, indicating that the position of the turn alters sequence
selectivity, although only for the mismatches.

Quantitative DNase I footprint titration experiments reveal
that ImImPy-γ-PyPyPy-β-Dp (1) is optimal within this series
of four polyamides. This hairpin binds a 5'-TGGTT-3' match
site with an equilibrium association constant of Ka = 1 × 10⁸
M⁻¹, while the corresponding hairpin PyPyPy-γ-ImImPy-β-Dp
(2), which differs only in the position of the γ turn, shows lower
affinity (Ka = ~2 × 10⁷ M⁻¹) for the 5'-TGGTT-3' site. Both
unacetylated polyamides demonstrate good specificity (>50-fold)
for the target match site over the single base pair mismatch
sites. The acetylated polyamides are similar in affinity to
PyPyPy-γ-ImImPy-β-Dp (2), but exhibit lower specificity.
AcImImPy-γ-PyPyPy-β-Dp (3) and AcPyPyPy-γ-ImImPy-β-Dp
(4) are virtually indistinguishable from each other on the
basis of affinity and specificity for the analyzed target sequences,
indicating little preference for turn position.

This series of contiguous imidazole-containing polyamides
is remarkably similar in affinity and specificity to the single
imidazole-containing hairpin polyamide, ImPyPy-γ-PyPyPy-β-Dp,
indicating little or no energetic penalty in this system for
adjacent imidazoles.8 Importantly, the position of the hairpin
turn does not significantly affect the recognition of the target
5'-TGGTT-3' match site, although single base pair mismatch
relative affinities are altered.

Implications for the Design of Minor Groove Binding
Molecules. The 2:1 motif has been used to specifically target
several sequences: 5'-TGTCA-3',1 5'-TGTTA-3', 5'-AAGTT-3',4
and 5'-TGCGCA-3'.5 The results reported herein add
sequences containing two contiguous G·C base pairs to the list,
expanding the sequence repertoire for DNA recognition by
polyamides. Furthermore, turn position showed minimal effects
on the specificity and affinity of the polyamides, indicating a
new degree of flexibility within the 2:1 motif. The expansion
of the polyamide sequence repertoire through contiguous G·C
recognition coupled with solid phase synthetic advances allowing
the rapid assembly and characterization of polyamides brings
the goal of sequence-specific recognition of any DNA sequence
by designed molecules closer to fruition.

Experimental Section 

Materials. Boc-glycine-(4-carbonylaminomethyl)-benzyl-ester-co-poly(styrene-divinylbenzene)
resin (Boc-G-Pam-resin) (0.2 mmol/g),
0.2 mmol/g Boc-β-alanine-(4-carbonylaminomethyl)-benzyl-ester-co-




Figure 4. MPE·Fe(II) footprinting experiment on a 3'-32P-labeled 266
bp EcoRI/PvuII restriction fragment from plasmid pMEPGG. The
5'-TGGTT-3', 5'-GGGTA-3', and 5'-TGTTA-3' sites are shown on the
right side of the autoradiogram. All reactions contain 10 kcpm restriction
fragment, 25 mM Tris-acetate, 10 mM NaCl, 100 μM calf thymus DNA
(bp), and 5 mM DTT. Lane 1: A reaction. Lane 2: G reaction. Lanes
3 and 8: MPE·Fe(II) standard. Lane 4: 10 μM ImImPy-γ-PyPyPy-β-Dp
(1). Lane 5: 10 μM PyPyPy-γ-ImImPy-β-Dp (2). Lane 6: 10 μM
AcImImPy-γ-PyPyPy-β-Dp (3). Lane 7: 10 μM AcPyPyPy-γ-ImImPy-β-Dp
(4). Lane 9: intact DNA.

poly(styrene-divinylbenzene) resin (Boc-β-Pam-resin), dicyclohexylcarbodiimide
(DCC), hydroxybenzotriazole (HOBt),
2-(1H-benzotriazol-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate (HBTU),
Boc-glycine, and Boc-β-alanine were purchased from Peptides International.
N,N-Diisopropylethylamine (DIEA), N,N-dimethylformamide (DMF),
N-methylpyrrolidone (NMP), DMSO/NMP, and acetic anhydride
(Ac₂O) were purchased from Applied Biosystems. Boc-γ-aminobutyric
acid was from NOVA Biochem, dichloromethane (DCM) and triethylamine
(TEA) were reagent grade from EM, thiophenol (PhSH) and
(dimethylamino)propylamine were from Aldrich, and trifluoroacetic acid
(TFA) was from Halocarbon. All reagents were used without further
purification.

1H NMR spectra were recorded in DMSO-d6 on a GE 300 instrument
operating at 300 MHz. Chemical shifts are reported in parts per million
relative to the solvent residual signal. UV spectra were measured on
a Hewlett-Packard Model 8452A diode array spectrophotometer.
Matrix-assisted, laser desorption/ionization time of flight mass spectrometry
(MALDI-TOF-MS) was carried out at the Protein and Peptide
Microanalytical Facility at the California Institute of Technology.
HPLC analysis was performed either on a HP 1090M analytical HPLC
or on a Beckman Gold system using a Rainin Microsorb MV, 5
μm, 300 × 4.6 mm reversed-phase column in 0.1% (w/v) TFA with
acetonitrile as eluent and a flow rate of 1.0 mL/min, gradient elution
1.25% acetonitrile/min. Preparatory HPLC was carried out on a
Beckman HPLC using a Waters DeltaPak 25 × 100 mm, 100 μm C18
column equipped with a guard, 0.1% (w/v) TFA, 0.25% acetonitrile/min.
18 MΩ water was obtained from a Millipore MilliQ water
purification system, and all buffers were 0.2 μm filtered. Reagent-grade
chemicals were used unless otherwise stated.

Activation of Boc-γ-aminobutyric Acid, Imidazole-2-carboxylic
Acid, and Pyrrole-2-carboxylic Acid. The appropriate amino acid or
acid (2 mmol) was dissolved in 2 mL of DMF. HBTU (720 mg, 1.9
mmol) was added followed by DIEA (1 mL) and the solution lightly
shaken for at least 5 min.

Activation of Boc-Imidazole Acid. Boc-imidazole acid (257 mg,
1 mmol) and HOBt (135 mg, 1 mmol) were dissolved in 2 mL of DMF,
DCC (202 mg, 1 mmol) was then added, and the solution was allowed
to stand for at least 5 min.

Typical Manual Synthesis Protocol. PyPyPy-γ-ImImPy-β-Dp.
Boc-β-Pam-resin (1.25 g, 0.25 mmol of amine) was shaken in DMF
for 30 min and drained. The N-Boc group was removed by washing
with DCM for 2 × 30 s, followed by a 1 min shake in 80% TFA/DCM/0.5 M
PhSH, draining the reaction vessel, a brief 80% TFA/DCM/0.5 M
PhSH wash, and 20 min shaking in 80% TFA/DCM/0.5
M PhSH solution. The resin was washed for 1 min with DCM and 30
s with DMF. A resin sample (8-10 mg) was taken for analysis. The
resin was drained completely, Boc-pyrrole-OBt monomer (357 mg, 1
mmol) dissolved in 2 mL of DMF was added, followed by DIEA (1
mL), and the resin was shaken vigorously to make a slurry. The
coupling was allowed to proceed for 45 min. A resin sample (8-10
mg) was taken after 40 min to check reaction progress. The reaction
vessel was washed with DMF for 30 s and dichloromethane for 1 min
to complete a single reaction cycle. Six additional cycles were
performed, adding successively Boc-Im-OH (DCC/HOBt), Boc-Im-OH
(DCC/HOBt), Boc-γ-aminobutyric acid (HBTU/DIEA), Boc-Py-OBt,
Boc-Py-OBt, and pyrrole-2-carboxylic acid (HBTU/DIEA). The resin
was washed with DMF, DCM, MeOH, and ethyl ether and then dried
in vacuo. PyPyPy-γ-ImImPy-β-Pam-resin (180 mg, 29 μmol)12 was
weighed into a glass scintillation vial, 1.5 mL of (N,N-dimethylamino)propylamine
was added, and the mixture was heated at 55 °C for 18 h. The
resin was removed by filtration through a disposable polypropylene
filter and washed with 5 mL of water, the amine solution and the water
washes were combined, and the solution was loaded on a C18 preparatory
HPLC column. The polyamide was then eluted in 100 min as a well-defined
peak with a gradient of 0.25% acetonitrile/min. The polyamide
was collected in four separate 8 mL fractions, and the purity of the
individual fractions was verified by HPLC and 1H NMR, to provide
purified PyPyPy-γ-ImImPy-β-Dp (2) (11.2 mg, 39% recovery): UV
λmax 246 (31 100), 312 (51 200); 1H NMR (DMSO-d6) δ 10.30 (s, 1
H), 10.26 (s, 1 H), 9.88 (s, 1 H), 9.80 (s, 1 H), 9.30 (s, 1 H), 9.2 (br
s, 1 H), 8.01 (m, 3 H), 7.82 (br s, 1 H), 7.54 (s, 1 H), 7.52 (s, 1 H),
7.20 (d, 1 H, J = 1.3 Hz), 7.18 (d, 1 H, J = 1.2 Hz), 7.15 (d, 1 H, J
= 1.3 Hz), 7.01 (d, 1 H, J = 1.4 Hz), 6.96 (d, 1 H, J = 1.4 Hz), 6.92
(d, 1 H, J = 1.8 Hz), 6.89 (m, 2 H), 6.03 (t, 1 H, J = 2.4 Hz), 3.97 (s,
3 H), 3.96 (s, 3 H), 3.85 (s, 3 H), 3.82 (s, 3 H), 3.78 (m, 6 H), 3.37 (m,
2 H), 3.20 (q, 2 H, J = 5.7 Hz), 3.08 (q, 2 H, J = 6.6 Hz), 2.94 (q, 2
H, J = 5.3 Hz), 2.71 (d, 6 H, J = 5.S Hz), 2.32 (m, 4 H), 1.83 (m, 4 H);
MALDI-TOF-MS 978.7 (979.1 calcd for M + H).
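
The 39% recovery quoted for polyamide 2 can be checked from the figures given in the protocol. This is a rough sketch under a stated assumption: the molecular weight is taken as the calculated M + H value (979.1), which is only an approximation to the mass of the isolated material. `percent_recovery` is a hypothetical helper name.

```python
def percent_recovery(mass_mg, resin_umol, mol_weight):
    """Isolated mass as a percentage of the theoretical mass from the resin loading."""
    theoretical_mg = resin_umol * 1e-6 * mol_weight * 1e3  # mol x g/mol -> g -> mg
    return 100.0 * mass_mg / theoretical_mg


# 11.2 mg isolated from 29 μmol of resin-bound polyamide, MW taken as 979.1.
print(round(percent_recovery(11.2, 29, 979.1)))  # prints 39
```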

ImImPy-γ-PyPyPy-β-Dp (1). Polyamide was prepared by machine-assisted
solid phase synthesis protocols,7 and 900 mg of resin was
cleaved and purified to provide 1 as a white powder (69 mg, 48%
recovery): UV λmax 246 (43 300), 308 (54 200); 1H NMR (DMSO-d6)

(11) Kent, S. B. H. Annu. Rev. Biochem. 1988, 57, 957.

(12) Resin substitution can be calculated as $L_{\mathrm{new}}\,(\mathrm{mmol/g}) = L_{\mathrm{old}}/(1 +
L_{\mathrm{old}}(W_{\mathrm{new}} - W_{\mathrm{old}}) \times 10^{-3})$, where L is the loading (mmol of amine/g of
resin), and W is the weight (g mol⁻¹) of the growing polyamide attached to
the resin. See: Barlos, K.; Chatzi, O.; Gatos, D.; Stavropoulos, G. Int. J.
Pept. Protein Res. 1991, 37, 513.
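
The resin substitution relation in footnote 12 can be written as a one-line function. This is a minimal sketch, assuming the reconstructed form L_new = L_old/(1 + L_old(W_new − W_old) × 10⁻³); `resin_substitution` is a hypothetical name, and the numbers in the example are illustrative rather than taken from the paper.

```python
def resin_substitution(l_old, w_old, w_new):
    """New loading (mmol/g) after the resin-bound species changes weight.

    l_old: loading before the change, in mmol of amine per g of resin
    w_old, w_new: weight (g/mol) of the attached species before and after
    """
    return l_old / (1.0 + l_old * (w_new - w_old) * 1e-3)


# Illustrative values only: a 0.2 mmol/g resin whose attached species gains 1000 g/mol.
print(round(resin_substitution(0.2, 0.0, 1000.0), 4))  # prints 0.1667
```

The loading falls as the chain grows because each gram of resin then contains more polyamide mass and correspondingly less support.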
[Figure 5 panels: ImImPy-γ-PyPyPy-β-Dp, PyPyPy-γ-ImImPy-β-Dp, AcImImPy-γ-PyPyPy-β-Dp, and AcPyPyPy-γ-ImImPy-β-Dp, each at 10 μM]

Figure 5. Histograms of cleavage protection (footprinting) data. (Top) Illustration of the 266 bp restriction fragment with the position of the
sequence indicated. MPE·Fe(II) protection patterns for polyamides at 10 μM concentration. Bar heights are proportional to the relative protection
from cleavage at each band. Boxes represent equilibrium binding sites determined by the published model.9 Only sites that were quantitated by
DNase I footprint titrations are boxed.



Table 1. Equilibrium Association Constants (M⁻¹)a,b

                            match site         single mismatch sites
polyamide                   5'-aTGGTTt-3'      5'-tTGTTAt-3'        5'-tGGGTAg-3'
ImImPy-γ-PyPyPy-β-Dp        1.0 × 10⁸ (0.1)    1.7 × 10⁶ (0.6)      <1 × 10⁶
PyPyPy-γ-ImImPy-β-Dp        1.6 × 10⁷ (0.2)    <1 × 10^?            <1 × 10^?
AcImImPy-γ-PyPyPy-β-Dp      1.3 × 10⁷ (0.7)    1.6 × 10^? (1.1)     1.3 × 10^? (0.8)
AcPyPyPy-γ-ImImPy-β-Dp      2.0 × 10⁷ (0.3)    1.3 × 10⁶ (0.5)      ≤1 × 10^?

(Exponents shown as ^? are illegible in this copy.)

a Values reported are the mean values measured from at least three footprint titration experiments, with the standard deviation for each data
set indicated in parentheses. b The assays were performed at 22 °C at pH 7.0 in the presence of 10 mM Tris-HCl, 10 mM KCl, 10 mM MgCl₂, and
5 mM CaCl₂.

δ 10.31 (s, 1 H), 9.91 (s, 1 H), 9.90 (s, 1 H), 9.85 (s, 1 H), 9.75 (s, 1
H), 9.34 (br s, 1 H), 8.03 (m, 3 H), 7.56 (s, 1 H), 7.46 (s, 1 H), 7.21
(m, 2 H), 7.15 (m, 2 H), 7.07 (d, 1 H, J = 1.2 Hz), 7.03 (d, 1 H, J =
1.3 Hz), 6.98 (d, 1 H, J = 1.2 Hz), 6.87 (m, 2 H), 4.02 (m, 6 H), 3.96
(m, 6 H), 3.87 (m, 6 H), 3.75 (q, 2 H, J = 4.9 Hz), 3.36 (q, 2 H, J =
4.0 Hz), 3.20 (q, 2 H, J = 4.7 Hz), 3.01 (q, 2 H, J = 5.1 Hz), 2.71 (d,
6 H, J = 4.8 Hz), 2.42 (m, 4 H), 1.80 (m, 4 H); MALDI-TOF-MS
978.8 (979.1 calcd for M + H).

AcImImPy-γ-PyPyPy-β-Dp (3). Polyamide was prepared by
manual solid phase protocols and isolated as a white powder (8 mg,
28% recovery): UV λmax 246 (43 400), 312 (50 200); 1H NMR (DMSO-d6)
δ 10.35 (s, 1 H), 10.30 (s, 1 H), 9.97 (s, 1 H), 9.90 (s, 1 H), 9.82
(s, 1 H), 9.30 (s, 1 H), 9.2 (br s, 1 H), 8.02 (m, 3 H), 7.52 (s, 1 H), 7.48
(s, 1 H), 7.21 (m, 2 H), 7.16 (d, 1 H, J = 1.1 Hz), 7.U (d, 1 H, J =
1.2 Hz), 7.04 (d, 1 H, J = 1.1 Hz), 6.97 (d, 1 H, J = 1.3 Hz), 6.92 (d,
1 H, J = 1.4 Hz), 6.87 (d, 1 H, J = 1.2 Hz), 3.99 (s, 3 H), 3.97 (s, 3
H), 3.83 (s, 3 H), 3.82 (s, 3 H), 3.80 (s, 3 H), 3.79 (s, 3 H), 3.47 (q, 2
H, J = 4.7 Hz), 3.30 (q, 2 H, J = 4.6 Hz), 3.20 (q, 2 H, J = 5.0 Hz),
3.05 (q, 2 H, J = 5.1 Hz), 2.75 (d, 6 H, J = 4.1 Hz), 2.27 (m, 4 H),
2.03 (s, 3 H), 1.74 (m, 4 H); MALDI-TOF-MS 1036.4 (1036.1 calcd
for M + H).

AcPyPyPy-γ-ImImPy-β-Dp (4). Polyamide was prepared by
machine-assisted solid phase protocols7 as a white powder (14 mg, 48%
recovery): UV λmax 246 (44 400), 312 (52 300); 1H NMR (DMSO-d6)
δ 10.32 (s, 1 H), 10.28 (s, 1 H), 9.89 (m, 2 H), 9.82 (s, 1 H), 9.18 (s, 1
H), 9.10 (br s, 1 H), 8.03 (m, 3 H), 7.55 (s, 1 H), 7.52 (s, 1 H), 7.21
(d, 1 H, J = 1.1 Hz), 7.18 (d, 1 H, J = 7.16 Hz), 7.15 (d, 1 H, J = 1.0
Hz), 7.12 (d, 1 H, J = 1.0 Hz), 7.02 (d, 1 H, J = 1.0 Hz), 6.92 (d, 1
H, J = 1.1 Hz), 6.87 (d, 1 H, J = 1.1 Hz), 6.84 (d, 1 H, J = 1.0 Hz),
3.97 (s, 3 H), 3.93 (s, 3 H), 3.87 (s, 3 H), 3.80 (s, 3 H), 3.78 (m, 6 H),
3.35 (q, 2 H, J = 5.6 Hz), 3.19 (q, 2 H, J = 5.3 Hz), 3.08 (q, 2 H, J
= 5.7 Hz), 2.87 (q, 2 H, J = 5.8 Hz), 2.71 (d, 6 H, J = 4.0 Hz), 2.33
(m, 4 H), 1.99 (s, 3 H), 1.74 (m, 4 H); MALDI-TOF-MS 1036.2 (1036.1
calcd for M + H).

Construction of Plasmid DNA. Using T4 DNA ligase, the plasmid
pMEPGG was constructed by ligation of an insert, 5'-GATCGC-
Figure 6. Quantitative DNase I footprint titration experiment with
ImImPy-γ-PyPyPy-β-Dp (1) on the 3'-32P-labeled 266 base pair EcoRI/
PvuII restriction fragment from plasmid pMEPGG. Lane 1: A reaction.
Lane 2: G reaction. Lanes 3 and 18: DNase I standard. Lanes 4-17:
50 pM, 100 pM, 200 pM, 500 pM, 1 nM, 2 nM, 5 nM, 10 nM, 20 nM,
50 nM, 100 nM, 200 nM, 500 nM, 1 μM ImImPy-γ-PyPyPy-β-Dp
(1), respectively. Lane 19: intact DNA. The 5'-TGGTT-3', 5'-GGGTA-3',
and 5'-TGTTA-3' sites which were analyzed are shown on the right
side of the autoradiogram. All reactions contain 10 kcpm restriction
fragment, 10 mM Tris-HCl, 10 mM KCl, 10 mM MgCl₂, and 5 mM
CaCl₂.

GAGCTCAGCGATGGTTTGCGATAGCGTGATTAGCGTATGCGTGGGTAGCGATACGC-3'
and 5'-GCGTATCGCTACCCACGCATACGCTAATCACGCTATCGCAAACCATCGCTGAGCTCGCGATC-3',
into pUC19 previously cleaved with BamHI and
HindIII. Ligation products were used to transform Epicurian Coli XL
1 Blue competent cells (Stratagene). Colonies were selected for
α-complementation on 25 mL Luria-Bertani medium agar plates
containing 50 μg/mL ampicillin and treated with XGAL and IPTG
solutions. Large-scale plasmid purification was performed with Qiagen
purification kits. Plasmid DNA concentration was determined at 260
nm using the relation 1 OD unit = 50 μg/mL duplex DNA. The
plasmid was linearized with EcoRI, followed by treatment with either
Klenow, deoxyadenosine 5'-[α-32P]triphosphate (Amersham), and
thymidine 5'-[α-32P]triphosphate for 3' labeling or calf alkaline phosphatase
and subsequent 5' end labeling with T4 polynucleotide kinase and
γ-[32P]dATP. The 3'- or 5'-end-labeled fragment was then digested
with PvuII and isolated by nondenaturing gel electrophoresis. The 3'-
or 5'-32P-end-labeled 266 base pair EcoRI/PvuII restriction fragment
was used in all experiments described here. Chemical sequencing
reactions were performed according to published protocols.13 Standard
protocols were used for all DNA manipulations.14
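
Two small checks follow from the plasmid construction paragraph: the two insert strands should be reverse complements of each other, and the 260 nm reading converts to a duplex DNA concentration through the stated relation 1 OD unit = 50 μg/mL. The sketch below assumes the two sequences as transcribed from the text with line-break hyphens removed; `reverse_complement` and `duplex_dna_ug_per_ml` are hypothetical helper names.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Insert strands as given in the text, written 5' to 3' in short chunks.
TOP = ("GATCGCGAGC" "TCAGCGATGG" "TTTGCGATAG" "CGTGATTAGC"
       "GTATGCGTGG" "GTAGCGATAC" "GC")
BOTTOM = ("GCGTATCGCTAC" "CCACGCATAC" "GCTAATCACG" "CTATCGCAAA"
          "CCATCGCTGA" "GCTCGCGATC")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def duplex_dna_ug_per_ml(a260):
    """Convert absorbance at 260 nm to μg/mL, using 1 OD unit = 50 μg/mL."""
    return 50.0 * a260


print(reverse_complement(TOP) == BOTTOM)   # the two strands pair fully
print("TGGTT" in TOP, "GGGTA" in TOP)      # match site and one mismatch site
print(duplex_dna_ug_per_ml(0.5))           # μg/mL for an illustrative reading
```

The insert carries the 5'-TGGTT-3' match site and the 5'-GGGTA-3' mismatch site; the 5'-TGTTA-3' site is not part of the insert as transcribed.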

Identification of Binding Sites by MPE·Fe(II) Footprinting. All
reactions were carried out in a total volume of 40 μL with final




Figure 7. Data for the quantitative DNase I footprint titration
experiments for the four polyamides 1-4 in complex with the
designated 5'-TGGTT-3' site (horizontal axis: [Polyamide]). The θnorm points were obtained using
photostimulable storage phosphor autoradiography and processed as
described in the Experimental Section. The data points for
ImImPy-γ-PyPyPy-β-Dp (1), PyPyPy-γ-ImImPy-β-Dp (2),
AcImImPy-γ-PyPyPy-β-Dp (3), and AcPyPyPy-γ-ImImPy-β-Dp (4) are indicated by
[symbols illegible] and ○, respectively. The solid curves are the best-fit Langmuir
binding titration isotherms obtained from a nonlinear least squares
algorithm using eq 2.

concentrations of species as indicated in parentheses. The ligands were
added to solutions of radiolabeled restriction fragment (10 000 cpm),
calf thymus DNA (100 μM bp), Tris-acetate (25 mM, pH 7.0), and
NaCl (10 mM) and incubated for 1 h at 22 °C. A 50 μM MPE·Fe(II)
solution was prepared by mixing 100 μL of a 100 μM MPE solution
with a freshly prepared 100 μM ferrous ammonium sulfate solution.
Footprinting reactions were initiated by the addition of MPE·Fe(II) (5
μM), followed 5 min later by the addition of dithiothreitol (5 mM),
and allowed to proceed for 15 min at 22 °C. Reactions were stopped
by ethanol precipitation, resuspended in 100 mM tris-borate-EDTA/
80% formamide loading buffer, and electrophoresed on 8% polyacrylamide
denaturing gels (5% cross-link, 7 M urea) at 2000 V for 1 h.
The gels were analyzed using storage phosphor technology.
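
The final concentrations quoted in the protocol follow from the usual dilution relation C1·V1 = C2·V2. As a hedged sketch, assume the 50 μM MPE·Fe(II) stock is added directly to reach the stated 5 μM in the 40 μL reaction; the text does not state the added volume, so the result is an inference. `stock_volume_ul` is a hypothetical helper name.

```python
def stock_volume_ul(final_conc, stock_conc, total_volume_ul):
    """Volume of stock (μL) giving final_conc in total_volume_ul (same concentration units)."""
    if stock_conc <= 0:
        raise ValueError("stock concentration must be positive")
    if final_conc > stock_conc:
        raise ValueError("final concentration cannot exceed the stock concentration")
    return final_conc * total_volume_ul / stock_conc


# 5 μM final from a 50 μM stock in a 40 μL reaction.
print(stock_volume_ul(5.0, 50.0, 40.0))  # prints 4.0
```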

Analysis of Energetics by Quantitative DNase I Footprint
Titration. All reactions were executed in a total volume of 40 μL
with final concentrations of each species as indicated. The ligands,
ranging from 50 pM to 1 μM, were added to solutions of radiolabeled
restriction fragment (10 000 cpm), Tris-HCl (10 mM, pH 7.0), KCl
(10 mM), MgCl₂ (10 mM), and CaCl₂ (5 mM) and incubated for 4 h
at 22 °C. Footprinting reactions were initiated by the addition of 4 μL
of a stock solution of DNase I (0.025 unit/mL) containing 1 mM
dithiothreitol and allowed to proceed for 6 min at 22 °C. The reaction
mixtures were stopped by addition of a 3 M sodium acetate solution
containing 50 mM EDTA and ethanol precipitated. The reactions were
resuspended in 100 mM tris-borate-EDTA/80% formamide loading
buffer and electrophoresed on 8% polyacrylamide denaturing gels (5%
cross-link, 7 M urea) at 2000 V for 1 h. The footprint titration gels
were dried and quantitated using storage phosphor technology.

Equilibrium association constants were determined as previously
described.10 The data were analyzed by performing volume integrations
of the 5'-TGGTT-3', 5'-TGTTA-3', and 5'-GGGTA-3' sites and
a reference site. Binding sites are assumed to be independent and
noninteracting as they are separated by at least one full turn of the
double helix. The apparent DNA target site saturation, θapp, was
calculated for each concentration of polyamide using the following
equation:

(13) (a) Iverson, B. L.; Dervan, P. B. Nucleic Acids Res. 1987, 15, 7823-7830.
(b) Maxam, A. M.; Gilbert, W. S. Methods Enzymol. 1980, 65, 499-560.

(14) Sambrook, J.; Fritsch, E. F.; Maniatis, T. Molecular Cloning; Cold
Spring Harbor Laboratory: Cold Spring Harbor, NY, 1989.
where Itot and Iref are the integrated volumes of the target and reference
sites, respectively, and Itot° and Iref° correspond to those values for a
DNase I control lane to which no polyamide has been added. The
([L]tot, θapp) data points were fitted to a Langmuir binding isotherm
(eq 2, n = 1) by minimizing the difference between θapp and θfit using
the modified Hill equation:

$$\theta_{\mathrm{fit}} = \theta_{\min} + (\theta_{\max} - \theta_{\min}) \frac{K_a^{\,n}[L]_{\mathrm{tot}}^{\,n}}{1 + K_a^{\,n}[L]_{\mathrm{tot}}^{\,n}} \qquad (2)$$

where [L]tot corresponds to the total polyamide concentration, Ka
corresponds to the association constant, and θmin and θmax represent
the experimentally determined site saturation values when the site is
unoccupied or saturated, respectively. The concentration of DNA used
for quantitative footprint titrations is <50 pM, which justifies the
assumption that free ligand concentration is approximately equal to
total ligand concentration. Data were fitted using a nonlinear least
squares fitting procedure of KaleidaGraph software (version 2.1,
Abelbeck software) running on a Power Macintosh 6100/60AV
computer with Ka, θmax, and θmin as the adjustable parameters. The
goodness-of-fit of the binding curve to the data points is evaluated by
the correlation coefficient, with R > 0.97 as the criterion for an
acceptable fit. At least three sets of acceptable data were used in
determining each association constant. All lanes from each gel were
used unless visual inspection revealed a data point to be obviously
flawed relative to neighboring points. The data were normalized using
the following equation:

$$\theta_{\mathrm{norm}} = \frac{\theta_{\mathrm{app}} - \theta_{\min}}{\theta_{\max} - \theta_{\min}} \qquad (3)$$



Quantitation by Storage Phosphor Technology Autoradiography.
Photostimulable storage phosphorimaging plates (Kodak Storage
Phosphor Screen S0230 obtained from Molecular Dynamics) were
pressed flat against gel samples and exposed in the dark at 22 °C for
12-16 h. A Molecular Dynamics 400S PhosphorImager was used to
obtain all data from the storage screens. The data were analyzed by
performing volume integrations of all bands using the ImageQuant
version 3.2 software running on an AST Premium 386/33 computer.
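
The footprint titration analysis described in the Experimental Section can be sketched in a few functions. Several points here are assumptions rather than the authors' procedure: the expression used for θapp is the form commonly used for this analysis (equation 1 is not legible in this copy), the fit scans Ka on a logarithmic grid with θmin and θmax held fixed instead of the three-parameter KaleidaGraph fit described in the text, and the data are synthetic. All function names are hypothetical.

```python
import math


def theta_app(i_tot, i_ref, i_tot0, i_ref0):
    """Apparent site saturation from band volumes (assumed standard form)."""
    return 1.0 - (i_tot / i_ref) / (i_tot0 / i_ref0)


def theta_fit(l_tot, ka, theta_min=0.0, theta_max=1.0, n=1):
    """Modified Hill equation; n = 1 gives a Langmuir binding isotherm."""
    x = (ka * l_tot) ** n
    return theta_min + (theta_max - theta_min) * x / (1.0 + x)


def theta_norm(theta, theta_min, theta_max):
    """Rescale an apparent saturation onto the 0-to-1 range."""
    return (theta - theta_min) / (theta_max - theta_min)


def fit_ka(concs, thetas, theta_min=0.0, theta_max=1.0):
    """Least-squares scan of log10(Ka) from 5.00 to 10.00 in steps of 0.01."""
    best_ka, best_sse = None, float("inf")
    for i in range(501):
        ka = 10.0 ** (5.0 + i / 100.0)
        sse = sum((theta_fit(c, ka, theta_min, theta_max) - t) ** 2
                  for c, t in zip(concs, thetas))
        if sse < best_sse:
            best_ka, best_sse = ka, sse
    return best_ka


# Polyamide concentrations (M) spanning 50 pM to 1 μM, as in the titration.
CONCS = [50e-12, 100e-12, 200e-12, 500e-12, 1e-9, 2e-9, 5e-9,
         10e-9, 20e-9, 50e-9, 100e-9, 200e-9, 500e-9, 1e-6]

# Synthetic, noise-free data generated with Ka = 1e8 M^-1 (not measured data).
SYNTHETIC = [theta_fit(c, 1.0e8) for c in CONCS]

ka_hat = fit_ka(CONCS, SYNTHETIC)
print(f"log10(Ka) = {math.log10(ka_hat):.2f}")
```

With real data the three-parameter fit described in the text would be the appropriate choice; the grid scan is used here only to keep the example free of external libraries.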
Abstract: The sequence-specific recognition of the minor groove of DNA by pyrrole-imidazole polyamides has
been extended to 9-13 base pairs (bp). Four polyamides, ImPyPy-Py-PyPyPy-Dp, ImPyPy-G-PyPyPy-Dp,
ImPyPy-β-PyPyPy-Dp, and ImPyPy-γ-PyPyPy-Dp (Im = N-methylimidazole, Py = N-methylpyrrole, Dp =
N,N-dimethylaminopropylamide, G = glycine, β = β-alanine, and γ = γ-aminobutyric acid), were synthesized and
characterized with respect to their DNA-binding affinities and specificities at sequences of composition
5'-(A,T)G(A,T)5C(A,T)-3' (9 bp) and 5'-(A,T)5G(A,T)C(A,T)5-3' (13 bp). In both sequence contexts, the β-alanine-linked
compound ImPyPy-β-PyPyPy-Dp has the highest binding affinity of the four polyamides, binding the 9 bp site
5'-TGTTAAACA-3' (Ka = 8 × 10^[illegible] M⁻¹) and the 13 bp site 5'-AAAAAGACAAAAA-3' (Ka = 5 × 10^[illegible] M⁻¹) with
affinities higher than the formally N-methylpyrrole-linked polyamide ImPyPy-Py-PyPyPy-Dp by factors of [illegible] and
~85, respectively (10 mM Tris-HCl, 10 mM KCl, 10 mM MgCl₂, and 5 mM CaCl₂, pH 7.0). The binding data for
ImPyPy-γ-PyPyPy-Dp, which has been shown previously to bind DNA in a "hairpin" conformation, indicates that
γ-aminobutyric acid does not effectively link polyamide subunits in an extended conformation. These results expand
the binding site size targetable with pyrrole-imidazole polyamides and provide structural elements that will facilitate
the design of new polyamides targeted to other DNA sequences.
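
The two sequence compositions named in the abstract can be expressed as regular expressions, which makes it easy to confirm that the quoted binding sites fit them. This is a minimal sketch: `NINE_BP` and `THIRTEEN_BP` are hypothetical names, and the patterns encode only the (A,T) degeneracy written in the abstract.

```python
import re

# 5'-(A,T)G(A,T)5C(A,T)-3' (9 bp) and 5'-(A,T)5G(A,T)C(A,T)5-3' (13 bp)
NINE_BP = re.compile(r"[AT]G[AT]{5}C[AT]")
THIRTEEN_BP = re.compile(r"[AT]{5}G[AT]C[AT]{5}")

print(bool(NINE_BP.fullmatch("TGTTAAACA")))          # 9 bp site from the abstract
print(bool(THIRTEEN_BP.fullmatch("AAAAAGACAAAAA")))  # 13 bp site from the abstract
```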



Introduction 



The development of 2:1 pyrrole-imidazole polyamide-DNA
complexes provides a new model for the design of molecules
for sequence-specific recognition in the minor groove of DNA.
The polyamide ImPyPy-Dp was shown to specifically bind the
mixed A,T/G,C sequence 5'-(A,T)G(A,T)C(A,T)-3' as a side-by-side
antiparallel dimer.1 In this complex, each polyamide
makes specific contacts with one strand on the floor of the minor
groove such that the sequence specificity depends on the
sequence of side-by-side amino acid pairings. A side-by-side
pairing of imidazole opposite pyrrole recognizes G·C base pairs,
while a side-by-side pairing of pyrrole opposite imidazole
recognizes C·G base pairs.1 A side-by-side pyrrole-pyrrole
pairing is partially degenerate and targets both A·T and T·A
base pairs.1,2 The generality of the 2:1 model has been
demonstrated by targeting other sequences of mixed A,T/G,C
composition.3-5 ImPyPy-Dp and distamycin (PyPyPy) bind
simultaneously to a 5'-(A,T)G(A,T)3-3' site as an antiparallel



* Abstract published in Advance ACS Abstracts, June 15, 1996.

(1) (a) Wade, W. S.; Mrksich, M.; Dervan, P. B. J. Am. Chem. Soc.
Figure 1. Ribbon models of (a) 9 bp "overlapped" and (b) 13 bp
"slipped" 2:1 polyamide-DNA complexes.

heterodimer/ A PylmPy polyamide and distamycin bind 5'- 
(Aj) 2 G(A,T) 2 -3 / 4 and ImPylmPy-Dp targets 5'-<AJ)GCGC- 
. (A.T)-3'. S 

The binding affinity and sequence specificity of a noncovalent
antiparallel homodimeric or heterodimeric polyamide-DNA

© 1996 American Chemical Society

(3)₂ · 5'-TGTTAAACA-3'    (3)₂ · 5'-AAAAAGACAAAAA-3'

Figure 2. (Top) Complexes of ImPyPy-β-PyPyPy-Dp with the targeted sites (a) 5'-TGTTAAACA-3' (9 bp, "overlapped") and (b)
5'-AAAAAGACAAAAA-3' (13 bp, "slipped"). Circles with dots represent lone pairs on N3 of purines and O2 of pyrimidines. Circles containing an
H represent the N2 hydrogen of guanine. Putative hydrogen bonds are illustrated by dashed lines. (Bottom) Complexes of ImPyPy-X-PyPyPy-Dp,
where X = Py, G, and β, with (a) 5'-TGTTAAACA-3' and (b) 5'-AAAAAGACAAAAA-3'. The shaded and light circles represent imidazole and
pyrrole rings, respectively, and the diamond represents the internal amino acid X. The specifically targeted guanines are highlighted.



complex can be increased by covalently linking the two
polyamides.⁶,⁷ The DNA-binding properties of the polyamides
ImPyPy-G-PyPyPy-Dp, ImPyPy-β-PyPyPy-Dp, and
ImPyPy-γ-PyPyPy-Dp, in which the terminal carboxyl group of ImPyPy
and the terminal amine of PyPyPy-Dp are connected with
glycine (G), β-alanine (β), and γ-aminobutyric acid (γ),
respectively, were recently reported.⁷,⁸ The γ-aminobutyric
acid-linked polyamide bound the designated target site
5'-TGTTA-3' with high affinity and sequence specificity and

(6) (a) Mrksich, M.; Dervan, P. B. J. Am. Chem. Soc. 1993, 115, 9892-9899.
(b) Dwyer, T. J.; Geierstanger, B. H.; Mrksich, M.; Dervan, P. B.;
Wemmer, D. E. J. Am. Chem. Soc. 1993, 115, 9900. (c) Mrksich, M.;
Dervan, P. B. J. Am. Chem. Soc. 1994, 116, 3663. (d) Singh, M. P.; Plouvier,
B.; Hill, G. C.; Gueck, J.; Pon, R. T.; Lown, J. W. J. Am. Chem. Soc.
1994, 116, 2006.

(7) Mrksich, M.; Parks, M. E.; Dervan, P. B. J. Am. Chem. Soc. 1994,
116, 7983.

(8) Mrksich, M. Ph.D. Thesis, California Institute of Technology, 1994.



exhibited a binding isotherm in quantitative footprinting experiments
consistent with formation of an intramolecular "hairpin"
complex in which the polyamide folds back on itself.⁷ Modeling
suggested that the glycine- and β-alanine-linked polyamides do
not favorably bind as "hairpins" in the minor groove of DNA.
Moreover, these polyamides exhibited cooperative binding
isotherms in quantitative footprinting experiments, consistent
with two polyamides binding in extended conformations as
intermolecular dimers.⁷,⁸ It appears that the glycine- and
β-alanine-linked polyamides disfavor binding in the hairpin
conformation and prefer to bind in an extended conformation.⁷,⁸
In a formal sense, there are multiple extended binding motifs
(and hence multiple binding site sequences) for polyamides of
sequence composition Im(Py)ₓ-Dp, as discussed below.

"Overlapped" and "Slipped" Binding Modes. We report
here the DNA-binding affinities of four polyamides having the
general sequence ImPyPy-X-PyPyPy-Dp, where X = Py, G, β,
or γ, to the 9 bp site 5'-TGTTAAACA-3' and to the 13 bp sites
5'-AAAAAGACAAAAA-3' and 5'-ATATAGACATATA-3'.
The polyamides having internal N-methylpyrrole, glycine, and
β-alanine residues were anticipated to bind the 9 and 13 bp
sites in an extended conformation. It was not clear at the outset
if the γ-aminobutyric acid-linked polyamide ImPyPy-γ-PyPyPy-Dp
would bind exclusively in an extended or "hairpin"
conformation to the targeted sites. For ImPyPy-X-PyPyPy-Dp
polyamides binding in an extended conformation, the
polyamide-DNA complexes expected to form at the 9 bp and 13 bp
target sites represent two distinct binding modes, which we refer
to as "overlapped" and "slipped", respectively. In the
"overlapped" (9 bp) binding mode, two ImPyPy-X-PyPyPy-Dp
polyamides bind directly opposite one another (Figures 1a and
2a). The "slipped" (13 bp) binding mode integrates the 2:1
and 1:1 polyamide-DNA binding motifs at a single site. In
this binding mode, the ImPyPy moieties of two
ImPyPy-X-PyPyPy-Dp polyamides bind the central 5'-AGACA-3' sequence
in a 2:1 manner as in the ImPyPy homodimer,¹ and the PyPyPy
moieties of the polyamides bind the all-A,T flanking sequences
as in the 1:1 complexes of distamycin (Figures 1b and 2b). The
structure of the complex formed by the polyamide
ImPyPy-G-PyPyPy-Dp with a 13 bp target site has been characterized by
2D NMR.⁹

In the 9 bp "overlapped" and 13 bp "slipped" binding sites
described above, the G·C and C·G base pairs are separated by
five and one A,T base pairs, respectively (TTAAA in
5'-TGTTAAACA-3' and a single A in 5'-AAAAAGACAAAAA-3'). While we have
concentrated here on these sites, we note that "partially slipped"
sites of 10, 11, and 12 bp in which the G·C and C·G base pairs
are separated by four, three, and two A,T base pairs,
respectively, are also potential binding sites of the polyamides studied
here.

Studies of the energetics of distamycin binding have shown
that while the binding affinities are similar for complexation to
poly[d(A-T)]·poly[d(A-T)] and poly[d(A)]·poly[d(T)], the
origins of these binding affinities are different.¹⁰ Binding to the
alternating copolymer is enthalpy driven, while binding to the
homopolymer is entropy driven.¹⁰ However, not all 5 bp sites
(A,T)₅ are bound with equal affinity. By quantitative
footprinting experiments, the distamycin analog Ac-PyPyPy-Dp (Ac =
acetyl) was shown to bind the sites 5'-AATAA-3' and
5'-TTAAT-3' with 2-fold and 14-fold lower affinity, respectively,
relative to the site 5'-AAAAA-3'.¹ᶜ On the basis of this result,
we anticipated that the 13 bp site 5'-AAAAAGACAAAAA-3'
may be bound with higher affinity than the 13 bp site
5'-ATATAGACATATA-3'.

The binding affinities of polyamides 1-4 (Figure 3) for the
three targeted sites 5'-TGTTAAACA-3', 5'-AAAAAGACAAAAA-3',
and 5'-ATATAGACATATA-3' were determined
by quantitative DNase I footprint titration experiments.

Results 

Synthesis of Polyamides. The polyamides
ImPyPy-Py-PyPyPy-Dp (1),¹¹ ImPyPy-G-PyPyPy-Dp (2),⁷
ImPyPy-β-PyPyPy-Dp (3),⁷ and ImPyPy-γ-PyPyPy-Dp (4)⁷ were prepared
as described previously.



(9) Geierstanger, B. H.; Mrksich, M.; Dervan, P. B.; Wemmer, D. E.
Nature Struct. Biol. 1996, 3, 321.

(10) (a) Breslauer, K. J.; Remeta, D. P.; Chou, W.-Y.; Ferrante, R.; Curry,
J.; Zaunczkowski, D.; Snyder, J. G.; Marky, L. A. Proc. Natl. Acad. Sci.
U.S.A. 1987, 84, 8922. (b) Marky, L. A.; Breslauer, K. J. Proc. Natl. Acad.
Sci. U.S.A. 1987, 84, 4359. (c) Marky, L. A.; Kupke, K. J. Biochemistry
1989, 28, 9982.

(11) Kelly, J. J.; Baird, E. E.; Dervan, P. B. Proc. Natl. Acad. Sci. U.S.A.,
in press.




Figure 3. Structures of the four polyamides ImPyPy-X-PyPyPy-Dp,
where X = N-methylpyrrole (Py), glycine (G), β-alanine (β), and
γ-aminobutyric acid (γ).



Footprinting. Quantitative DNase I footprinting¹²,¹³ on the
3'-³²P-labeled 281 bp pJT3 AflII/FspI restriction fragment
(Figures 4 and 5) (10 mM Tris-HCl, 10 mM KCl, 10 mM
MgCl₂, 5 mM CaCl₂, pH 7.0, 22 °C) reveals that, of the four
polyamides ImPyPy-X-PyPyPy-Dp, three (X = Py, G, β) bind
to both the 9 bp "overlapped" site 5'-TGTTAAACA-3' and the
13 bp "slipped" site 5'-AAAAAGACAAAAA-3' with
equilibrium association constants (Kₐ) greater than 5 × 10⁷ M⁻¹ (Table
1) and display cooperative binding isotherms (eq 2, n = 2) at
these sites consistent with binding as intermolecular dimers
(Figure 6). The fact that the polyamides ImPyPy-G-PyPyPy-Dp
and ImPyPy-β-PyPyPy-Dp bind in the 9 bp "overlapped"
binding mode indicates that the internal glycine and β-alanine
amino acids are accommodated opposite a second ligand in a
2:1 polyamide-DNA complex.

The polyamide ImPyPy-γ-PyPyPy-Dp binds the site
5'-TGTTAAACA-3' with an equilibrium association constant Kₐ
≈ 1 × 10⁸ M⁻¹ and also binds the site 5'-AAAAAGACAAAAA-3'
with lower affinity (Kₐ = 6 × 10⁶ M⁻¹). This compound
displays binding isotherms (eq 2, n = 1) at these sites (Figure
6), consistent with binding as an intramolecular hairpin to the
5 bp "matched" sites 5'-TGTTA-3' and 5'-AAACA-3' (Figure
7) and to the 5 bp "single base pair mismatch" site
5'-AGACA-3'.⁷ Significantly, it appears from these results that the
γ-aminobutyric acid linker of ImPyPy-γ-PyPyPy-Dp does not
effectively link polyamide subunits in an extended conformation.

Binding Affinities. Comparison of the binding affinities of
the four polyamides ImPyPy-X-PyPyPy-Dp, where X = Py, G,
β, and γ, reveals that the internal amino acid X has a dramatic
effect on complex stabilities (Table 1, Figure 6). The formally
N-methylpyrrole-linked polyamide ImPyPy-Py-PyPyPy-Dp binds

(12) (a) Galas, D.; Schmitz, A. Nucleic Acids Res. 1978, 5, 3157. (b)
Fox, K. R.; Waring, M. J. Nucleic Acids Res. 1984, 12, 9271.

(13) (a) Brenowitz, M.; Senear, D. F.; Shea, M. A.; Ackers, G. K.
Methods Enzymol. 1986, 130, 132. (b) Brenowitz, M.; Senear, D. F.; Shea,
M. A.; Ackers, G. K. Proc. Natl. Acad. Sci. U.S.A. 1986, 83, 8462. (c)
Senear, D. F.; Brenowitz, M.; Shea, M. A.; Ackers, G. K. Biochemistry
1986, 25, 7344.
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5' -GCAACTGTTGGGAAGGGCGATCGGTGCGGGCCTCTTCGCTArTACGCCAGCTGGCGAAAGGGGGATG 
3 ' -CGTTGACAACCCTTCCCGCTAGCCACGCCCGGAGAAG 

CCAGGGTTTTCCCAGTCA.CX;aCGM^GTAAAA^ 

GGTCCC AA ^AC GGTC AGTGCTG C AAC A TTTTGCTGCCGG*] 'C AC IT AAGCTCG AGCCATGGtX't : l "I TGCATCGC ATCGCC ACGTTTTTCTGTTTTT CCG 
TCGACGCCGCATATA<^CATATAGGGCCGTCGACGCTGTTAAACAGGCTCGACGCCAGCTCGTCCTAGCTAGCGTC 

AGCTCCCCCC TATATCTGTATATCCCGGCAGCTGCGACAATTTCT CCG AGCTGCGGTCGAGC AGG ATCG ATCGCAGCATCGC AGAATT - 5 ' 

Figure 4. Sequence of the 281 bp pJT3 AflII/FspI restriction fragment. The three binding sites that were analyzed in quantitative footprint titration
experiments are indicated.

Table 1. Association Constants (M⁻¹) for Polyamides ImPyPy-X-PyPyPy-Dp, Where X = N-Methylpyrrole (Py), Glycine (G), β-Alanine (β),
and γ-Aminobutyric Acid (γ)ᵃ,ᵇ

                                                     polyamide
binding site             ImPyPy-Py-PyPyPy-Dp    ImPyPy-G-PyPyPy-Dp    ImPyPy-β-PyPyPy-Dp            ImPyPy-γ-PyPyPy-Dp
5'-TGTTAAACA-3'          9.7 (±2.3) × 10⁷       1.4 (±0.1) × 10⁸      7.8 (± [illegible]) × 10⁸     1.4 (±0.3) × 10⁸
5'-AAAAAGACAAAAA-3'      5.4 (±1.5) × 10⁷       1.1 (±0.1) × 10⁸      >4.7 (±0.7) × 10⁹             6.4 (±0.6) × 10⁶
5'-ATATAGACATATA-3'      3.6 (±0.5) × 10⁷       6.6 (±0.4) × 10⁶      1.0 (±0.1) × 10⁹              4.6 (± [illegible]) × 10⁶

ᵃ The reported association constants are the mean values obtained from three DNase I footprint titration experiments. The standard deviation for
each value is indicated in parentheses. ᵇ The assays were carried out at 22 °C at pH 7.0 in the presence of 10 mM Tris-HCl, 10 mM KCl, 10 mM
MgCl₂, and 5 mM CaCl₂.



5'-TGTTAAACA-3' and 5'-AAAAAGACAAAAA-3' with
equilibrium association constants ≈1 × 10⁸ M⁻¹ and 5 × 10⁷
M⁻¹, respectively. The glycine-linked polyamide
ImPyPy-G-PyPyPy-Dp binds both 5'-TGTTAAACA-3' and
5'-AAAAAGACAAAAA-3' with affinities similar (equal and 2-fold higher,
respectively) to those of ImPyPy-Py-PyPyPy-Dp. In contrast,
the β-alanine-linked polyamide ImPyPy-β-PyPyPy-Dp binds
5'-TGTTAAACA-3' and 5'-AAAAAGACAAAAA-3' with
affinities higher than those of ImPyPy-Py-PyPyPy-Dp by factors of
approximately 8 (Kₐ = 8 × 10⁸ M⁻¹) and 85 (Kₐ = 5 × 10⁹
M⁻¹), respectively. Relative to ImPyPy-Py-PyPyPy-Dp, the
hairpin-forming polyamide ImPyPy-γ-PyPyPy-Dp binds
5'-TGTTAAACA-3' (a matched hairpin binding site) and
5'-AAAAAGACAAAAA-3' (a mismatched hairpin binding site)
with equal and 8-fold lower affinities, respectively.
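
The fold differences quoted in this paragraph can be rechecked from the Table 1 values. The sketch below is illustrative only: the dictionary keys and function names are assumptions made for this example, the association constants are transcribed from Table 1 as best they can be read in the scan (exponents inferred from the fold changes stated in the text), and the β-alanine value at the 13 bp site is a reported lower bound, so ratios involving it are lower bounds as well.

```python
# Association constants (M^-1) transcribed from Table 1; keys are hypothetical labels.
KA = {
    "Py":    {"TGTTAAACA": 9.7e7, "AAAAAGACAAAAA": 5.4e7, "ATATAGACATATA": 3.6e7},
    "G":     {"TGTTAAACA": 1.4e8, "AAAAAGACAAAAA": 1.1e8, "ATATAGACATATA": 6.6e6},
    "beta":  {"TGTTAAACA": 7.8e8, "AAAAAGACAAAAA": 4.7e9, "ATATAGACATATA": 1.0e9},
    "gamma": {"TGTTAAACA": 1.4e8, "AAAAAGACAAAAA": 6.4e6, "ATATAGACATATA": 4.6e6},
}


def fold_vs_reference(linker, site, reference="Py"):
    """Affinity of `linker` at `site` relative to the reference polyamide."""
    return KA[linker][site] / KA[reference][site]


def specificity(linker, preferred="AAAAAGACAAAAA", other="ATATAGACATATA"):
    """Ratio of association constants for two sites bound by the same polyamide."""
    return KA[linker][preferred] / KA[linker][other]


if __name__ == "__main__":
    for linker in ("G", "beta", "gamma"):
        for site in ("TGTTAAACA", "AAAAAGACAAAAA"):
            ratio = fold_vs_reference(linker, site)
            print(f"{linker:>5} vs Py at {site}: {ratio:.2f}-fold")
    for linker in KA:
        print(f"{linker:>5} specificity (A-tract vs alternating flank): {specificity(linker):.1f}")
```

Ratios below 1 correspond to lower affinity than the reference; for example, the γ-linked polyamide at the 13 bp site gives roughly 0.12, the reciprocal of the approximately 8-fold decrease stated in the text.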

Specificity for 5'-AAAAAGACAAAAA-3' versus
5'-ATATAGACATATA-3'. Comparison of the binding affinities
of the four polyamides at the 13 bp sites 5'-AAAAAGACAAAAA-3'
and 5'-ATATAGACATATA-3' indicates that the
specificity between the sites depends on the internal amino acid
(Table 1). ImPyPy-G-PyPyPy-Dp and ImPyPy-β-PyPyPy-Dp
are approximately 20-fold and >5-fold specific, respectively, for
5'-AAAAAGACAAAAA-3' versus 5'-ATATAGACATATA-3'.
The polyamides ImPyPy-Py-PyPyPy-Dp and
ImPyPy-γ-PyPyPy-Dp bind 5'-AAAAAGACAAAAA-3' and
5'-ATATAGACATATA-3' with similar affinities.

Discussion 

Implications for the Design of Minor Groove Binding
Molecules. The results presented here reveal that β-alanine is
an optimal linker for joining two three-ring subunits in an
extended conformation, providing a useful structural motif for
the design of new polyamides targeted to sites longer than 8
bp. Recently, it has been shown that as the length of a
polyamide having the general sequence Im(Py)ₓ-Dp increases
beyond five rings (corresponding to a 7 bp binding site), the
binding affinity ceases to increase with increasing polyamide
length, indicating that the N-methylimidazole and
N-methylpyrrole residues fail to maintain an appropriate base-pair register
across the entire length of the polyamide-DNA complex.¹¹ The
higher binding affinities observed here for ImPyPy-β-PyPyPy-Dp
relative to ImPyPy-Py-PyPyPy-Dp indicate that the flexible
β-alanine linker relieves the register mismatch, allowing both
three-ring subunits to bind optimally. Notably, higher binding
affinities are observed for ImPyPy-β-PyPyPy-Dp versus
ImPyPy-Py-PyPyPy-Dp despite the higher conformational entropy
and lower aromatic surface area of the β-alanine-linked
polyamide. The observation here that β-alanine effectively links
polyamide subunits within 2:1 polyamide-DNA complexes is
consistent with the previously reported finding that β-alanine
effectively links polyamide subunits within 1:1
polyamide-DNA complexes.¹⁴

From the standpoint of binding specificity, the observation
here that a single compound can bind in multiple binding modes
is problematic. The next generation of pyrrole-imidazole
polyamides targeted to binding sites greater than 8 bp in length
should incorporate constraints that specify a single binding
mode.

The previously described γ-aminobutyric acid-based "hairpin"
motif⁷ complements the β-alanine-based "extended" motif
described here. For ImPyPy-γ-PyPyPy-Dp, the binding
isotherms and affinities observed here are consistent with the
previous report that γ-aminobutyric acid effectively links
polyamide subunits in a "hairpin" conformation⁷ and indicate
that γ-aminobutyric acid does not effectively link polyamide
subunits in an extended conformation. Significantly, these
observations indicate that (1) extended binding modes will not
compromise the sequence specificity of hairpin-forming,
γ-aminobutyric acid-linked polyamides, and (2) β-alanine and
γ-aminobutyric acid linkers could be used within a single polyamide
with predictable results.

The results reported here expand the binding site size
targetable with pyrrole-imidazole polyamides and provide
structural motifs that will facilitate the design of new
pyrrole-imidazole polyamides targeted to other sequences.

Experimental Section 

Materials. Restriction endonucleases were purchased from either
New England Biolabs or Boehringer-Mannheim and used according to
the manufacturer's protocol. Escherichia coli XL-1 Blue competent
cells were obtained from Stratagene. Sequenase (version 2.0) was
obtained from United States Biochemical, and DNase I was obtained
from Pharmacia. [α-³²P]Thymidine 5'-triphosphate (>3000 Ci/mmol)
and [α-³²P]deoxyadenosine 5'-triphosphate (>6000 Ci/mmol) were
purchased from DuPont NEN.

Construction of Plasmid DNA. Plasmid pJT3 was prepared by
standard methods. Briefly, plasmid pJT1 was prepared by hybridization
of two complementary sets of synthetic oligonucleotides: 5'-CCGG-

(14) (a) Youngquist, R. S.; Dervan, P. B. J. Am. Chem. Soc. 1987, 109,
7564. (b) Griffin, J. H.; Dervan, P. B. Unpublished results.




ImPyPy-β-PyPyPy-Dp

Figure 5. Storage phosphor autoradiograms of 8% denaturing polyacrylamide gels used to separate the fragments generated by DNase I digestion
in quantitative footprint titration experiments. Lanes 1 and 2: A and G sequencing lanes. Lanes 3 and 21: DNase I digestion products obtained in
the absence of polyamide. Lanes 4-20: DNase I digestion products obtained in the presence of 0.1 nM (0.01 nM), 0.2 nM (0.02 nM), 0.5 nM (0.05
nM), 1 nM (0.1 nM), 1.5 nM (0.15 nM), 2.5 nM (0.25 nM), 4 nM (0.4 nM), 6.5 nM (0.65 nM), 10 nM (1 nM), 15 nM (1.5 nM), 25 nM (2.5 nM),
40 nM (4 nM), 65 nM (6.5 nM), 100 nM (10 nM), 200 nM (20 nM), 500 nM (50 nM), 1 μM (0.1 μM) polyamide (concentrations used for polyamide
ImPyPy-β-PyPyPy-Dp only are in parentheses). Lane 22: intact DNA. The targeted binding sites are indicated on the right side of the autoradiograms.
All reactions contain 15 kcpm restriction fragment, 10 mM Tris-HCl, 10 mM KCl, 10 mM MgCl₂, and 5 mM CaCl₂.

(b) ImPyPy-X-PyPyPy-Dp · 5'-AAAAAGACAAAAA-3'

Figure 6. Data obtained from quantitative DNase I footprint titration
experiments showing the effect of the internal amino acid X on binding
of the polyamides ImPyPy-X-PyPyPy-Dp to the sites (a)
5'-TGTTAAACA-3' and (b) 5'-AAAAAGACAAAAA-3', where X = G (□),
β (●), γ (○), and Py [symbol illegible]. The (θ_norm, [L]_tot) data points were obtained
as described in the Experimental Section. Each data point is the average
value obtained from three quantitative footprint titration experiments.

5'-TGTTAAACA-3'
3'-ACAATTTGT-5'

5'-TGTTAAACA-3'
3'-ACAATTTGT-5'

Figure 7. Model for the complex formed by ImPyPy-γ-PyPyPy-Dp
with 5'-TGTTAAACA-3' (5 bp, "hairpin").

GAACGTAGCGTACCGGTGCAAAAAGACAAAAAGG-
CTCGA-3' and 5'-GCCGTCGAGCCTTTTTGTCTTTTTGCACC-
GGTACGCTACGTTC-3', and 5'-CGCCGCATATAGACATATAG-
GGCCCAGCTCGTCCTAGCTAGCGTCGTAGCGTCTTAAG-3' and
5'-TCGACTTAAGACGCTACGACGCTAGCTAGGACGAGCT-
GGGCCCTATATGTCTATATG-3'. The resulting oligonucleotide
duplexes were phosphorylated with ATP and T4 polynucleotide kinase
and ligated to the large pUC19 AvaI/SalI restriction fragment using
T4 DNA ligase. E. coli XL-1 Blue competent cells were then
transformed with the ligated plasmid, and plasmid DNA from
ampicillin-resistant white colonies was isolated using a Promega Maxi-Prep kit.
Plasmid pJT3 was prepared in a similar manner, except that the
following synthetic oligonucleotides were hybridized and cloned into
the large ApaI/SalI fragment of pJT1: 5'-GTCGACGCTGTTAAA-
CAGGCTCGACGCCAGCTCGTCCTAGCTAGCGT-
CGTAGCGTCTTAAGAG-3' and 5'-TCGACTCTTAAGACGCTAC-
GACGCTAGCTAGGACGAGCTGGCGTCGAGCCT-
GTTTAACAGCGTCGACGGCC-3'. The presence of the desired insert
was determined by restriction analysis and dideoxy sequencing.
Plasmid DNA concentration was determined at 260 nm using the
relation 1 OD unit = 50 μg/mL duplex DNA.

Preparation of ³²P-End-Labeled Restriction Fragments. Plasmid
pJT3 was digested with AflII, labeled at the 3'-end using Sequenase
(version 2.0), and digested with FspI. The 281 bp restriction fragment
was isolated by nondenaturing gel electrophoresis and used in all
quantitative footprinting experiments described here. Chemical
sequencing reactions were performed as described.¹⁵,¹⁶ Standard
techniques were employed for DNA manipulations.¹⁷

Quantitative DNase I Footprint Titration.¹²,¹³ All reactions were
executed in a total volume of 40 μL. We note explicitly that no carrier
DNA was used in these reactions. A polyamide stock solution or H₂O
(for reference lanes) was added to an assay buffer containing
radiolabeled restriction fragment (15 000 cpm), affording final solution
conditions of 10 mM Tris-HCl, 10 mM KCl, 10 mM MgCl₂, 5 mM
CaCl₂, pH 7.0, and either (i) 0.1 nM to 1 μM polyamide, for all
polyamides except ImPyPy-β-PyPyPy-Dp, (ii) 0.01 nM to 0.1 μM
polyamide for ImPyPy-β-PyPyPy-Dp, or (iii) no polyamide (for
reference lanes). The solutions were allowed to equilibrate for 5 h at
22 °C. Footprinting reactions were initiated by the addition of 4 μL
of a DNase I stock solution (at the appropriate concentration to give
≈55% intact DNA) containing 1 mM dithiothreitol and allowed to
proceed for 7 min at 22 °C. The reactions were stopped by the addition
of 10 μL of a solution containing 1.25 M NaCl, 100 mM EDTA, and
0.2 mg/mL glycogen, and ethanol precipitated. The reaction mixtures
were resuspended in 1× TBE/80% formamide loading buffer, denatured
by heating at 85 °C for 10 min, and placed on ice. The reaction
products were separated by electrophoresis on an 8% polyacrylamide
gel (5% cross-link, 7 M urea) in 1× TBE at 2000 V. Gels were dried
and exposed to a storage phosphor screen (Molecular Dynamics).

Quantitation and Data Analysis. Data from the footprint titration
gels were obtained using a Molecular Dynamics 400S PhosphorImager
followed by quantitation using ImageQuant software (Molecular
Dynamics). Background-corrected volume integration of rectangles
encompassing the footprint sites and a reference site at which DNase
I reactivity was invariant across the titration generated values for the
site intensities (I_site) and the reference intensity (I_ref). The apparent
fractional occupancies (θ_app) of the sites were calculated using the
equation

$$\theta_{app} = 1 - \frac{I_{site}/I_{ref}}{I_{site}^{\circ}/I_{ref}^{\circ}} \qquad (1)$$

where I°_site and I°_ref are the site and reference intensities, respectively,
from a control lane to which no polyamide was added.

The ([L]_tot, θ_app) data points were fitted to a general Hill equation
(eq 2) by minimizing the difference between θ_app and θ_fit:

$$\theta_{fit} = \theta_{min} + (\theta_{max} - \theta_{min}) \frac{K_a^{n}[L]_{tot}^{n}}{1 + K_a^{n}[L]_{tot}^{n}} \qquad (2)$$

where [L]_tot is the total polyamide concentration, K_a is the apparent
association constant, and θ_min and θ_max are the experimentally
determined site saturation values when the site is unoccupied or saturated,
respectively. The data were fitted using a nonlinear least-squares fitting
procedure with K_a, θ_max, and θ_min as the adjustable parameters, and with
either n = 2 or n = 1 depending on which value of n gave the better
fit. We note explicitly that treatment of the data in this manner does
not represent an attempt to model a binding mechanism. Rather, we
have chosen to compare values of the association constant, a parameter
that represents the concentration of polyamide at which the binding
site is half-saturated. The binding isotherms were normalized using
the following equation:

$$\theta_{norm} = \frac{\theta_{app} - \theta_{min}}{\theta_{max} - \theta_{min}} \qquad (3)$$

Three sets of data were used in determining each association constant.

(15) Iverson, B. L.; Dervan, P. B. Nucleic Acids Res. 1987, 15, 7823-7830.

(16) Maxam, A. M.; Gilbert, W. S. Methods Enzymol. 1980, 65, 499-560.

(17) Sambrook, J.; Fritsch, E. F.; Maniatis, T. Molecular Cloning; Cold
Spring Harbor Laboratory: Cold Spring Harbor, NY, 1989.

The method for determining association constants used here involves
the assumption that [L]_tot ≈ [L]_free, where [L]_free is the concentration of
polyamide free in solution (unbound). For very high association
constants this assumption becomes invalid, resulting in underestimated
association constants. In the experiments described here, the DNA
concentration is estimated to be ≈50 pM. As a consequence, measured
association constants of 1 × 10⁹ M⁻¹ and 5 × 10⁹ M⁻¹ underestimate
the true association constants by factors of less than approximately 1.5 and
approximately 1.5-2, respectively.
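
The isotherm analysis described in this section can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' fitting procedure: it assumes the general Hill form θ = θ_min + (θ_max − θ_min)·(K_a·[L])ⁿ / (1 + (K_a·[L])ⁿ), holds θ_min and θ_max fixed rather than fitting them, and replaces the nonlinear least-squares routine with a coarse scan over log₁₀ K_a. All function names are hypothetical.

```python
def hill_occupancy(l_tot, k_a, n, theta_min=0.0, theta_max=1.0):
    """General Hill isotherm: occupancy at total ligand concentration l_tot (M)."""
    x = (k_a * l_tot) ** n
    return theta_min + (theta_max - theta_min) * x / (1.0 + x)


def normalize(theta_app, theta_min, theta_max):
    """Rescale an apparent occupancy onto the 0-1 range (normalization step)."""
    return (theta_app - theta_min) / (theta_max - theta_min)


def fit_k_a(concs, thetas, n, theta_min=0.0, theta_max=1.0):
    """Least-squares estimate of K_a by scanning log10(K_a) from 4 to 12.

    A simple grid scan (step 0.001 in log10 units) stands in for the
    nonlinear least-squares procedure mentioned in the text.
    """
    def sum_squared_error(k_a):
        return sum(
            (hill_occupancy(c, k_a, n, theta_min, theta_max) - t) ** 2
            for c, t in zip(concs, thetas)
        )

    grid = [10 ** (i / 1000) for i in range(4000, 12001)]
    return min(grid, key=sum_squared_error)


if __name__ == "__main__":
    # Synthetic titration over the 0.1 nM to 1 uM range used in the experiments.
    concs = [1e-10, 2e-10, 5e-10, 1e-9, 1.5e-9, 2.5e-9, 4e-9, 6.5e-9, 1e-8,
             1.5e-8, 2.5e-8, 4e-8, 6.5e-8, 1e-7, 2e-7, 5e-7, 1e-6]
    synthetic = [hill_occupancy(c, 1e8, 2) for c in concs]
    print(f"recovered K_a = {fit_k_a(concs, synthetic, n=2):.3g} M^-1")
```

In this form the site is half-saturated when [L] = 1/K_a for either value of n, which is the sense in which the association constant is compared across polyamides. Running the scan once with n = 1 and once with n = 2 and keeping the lower residual mirrors the choice between hairpin-like and cooperative isotherms.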
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Hin Recombinase Bound to DNA: 
The Origin of Specificity in Major 
and Minor Groove Interactions 

Jin-An Feng, Reid C. Johnson, Richard E. Dickerson*

The structure of the 52-amino acid DNA-binding domain of the prokaryotic Hin
recombinase, complexed with a DNA recombination half-site, has been solved by x-ray
crystallography at 2.3 angstrom resolution. The Hin domain consists of a three-α-helix bundle,
with the carboxyl-terminal helix inserted into the major groove of DNA, and two flanking
extended polypeptide chains that contact bases in the minor groove. The overall structure
displays features resembling both a prototypical bacterial helix-turn-helix and the
eukaryotic homeodomain, and in many respects is an intermediate between these two
DNA-binding motifs. In addition, a new structural motif is seen: the six-amino acid
carboxyl-terminal peptide of the Hin domain runs along the minor groove at the edge of the
recombination site, with the peptide backbone facing the floor of the groove and side chains
extending away toward the exterior. The x-ray structure provides an almost complete
explanation for DNA mutant binding studies in the Hin system and for DNA specificity
observed in the Hin-related family of DNA invertases.



The Hin recombinase catalyzes a DNA
inversion reaction in the Salmonella
chromosome (1, 2). This site-specific
recombination reaction controls the alternate
expression of two flagellin genes by reversibly
switching the orientation of a promoter.
During the process of inverting the 1-kb
segment of DNA, Hin proteins bind to left
and right recombination sites (hixL and
hixR, respectively) located at the
boundaries of the invertible DNA segment. The
hixL and hixR sites with their bound Hin
protein then form a synaptic complex with
a third cis-acting site, a recombinational
enhancer, which itself is bound by two
dimers of the 98-amino acid Fis protein.
Formation of this invertasome complex
(3-5) aligns the two recombination sites
correctly and activates the Hin protein to
initiate the exchange of DNA strands,
leading to inversion of the intervening DNA.

Hin belongs to a family of bacterial DNA
invertases or recombinases that includes Gin
from phage Mu, Cin from phage P1, and Pin
from the e14 prophage of Escherichia coli. In
addition to sharing 66 to 80 percent
sequence identity between pairs of sequences,
this family of proteins can substitute
functionally for one another in each biological
system (1). These DNA invertases most
likely constitute an evolutionary family not
unlike the c-type cytochromes. The
availability of DNA sequence information from
the binding sites of all four systems makes
the present study of Hin-DNA binding
especially informative in elucidating principles
of protein-DNA recognition.

The authors are with the Molecular Biology Institute
and Department of Biological Chemistry, University of
California, Los Angeles, CA 90024.

Hin binds to each hixL and hixR
recombination site as a dimer. The final 52 amino
acids of the two chains (Fig. 1A) are
involved in binding to a 26-bp recombination
site (Fig. 1B) built from two 12-bp
imperfect inverted repeats separated by a 2-bp
core region where DNA strand exchange
occurs (6-8). The amino-terminal
138-amino acid "catalytic" domain is positioned
in part over the core nucleotides. Although
the monomeric 52-amino acid peptide by
itself can bind to a recombination half-site
with low-to-moderate affinity (dissociation
constant ≈ 10⁻⁷), cooperative
interactions between the amino-terminal domains
of two intact Hin molecules are required for
high-affinity binding (Kd ≈ 10⁻⁹) (8-10).

A large body of footprinting, mutation,
and chemical derivatization data has
indicated features of Hin-DNA interaction
which are distinctive to prokaryotic
DNA-binding proteins (8-13). Specific binding
requires both major groove interactions
involving a helix-turn-helix (HTH) α-helix
motif and minor groove interactions
involving the sequence Gly139-Arg140-Pro141-Arg142.
If residues 139 and 140 are deleted
from the carboxyl-terminal domain, for
example, then sequence-specific binding to
DNA is abolished. As Fig. 1A shows, these



(A) Amino acid sequences of the DNA-binding domains (panel heading only partly legible; sequences as scanned)
Hin OKPRAITKH . EQEQISIUULEK . OHP . RQQLAIIF .GIG . V8TLTRYT . PASSIKXRMN
Gin ORPPXLTKA. EQEQAGHLUVQ .GIF. RKQVALXT .DVA . LflTLTKCH . FAKRAH I ENDDRH*
Cin ORKFXYQEE . TCX5QMRRLLEK .GIF. RXQVAIIT . DVA . VSTLTXKF . FASSFQS
Pin GRRPXLTPE . QQAQAGHLIAA . OTP . RQKVAIXT . DVG . VSTLYXRJ . PAGDK

Fig. 1. Amino acid and DNA base sequences in the Hin recombinase family (1, 34). (A) Amino acid
sequence of the DNA-binding domain (the 52 carboxyl-terminal residues) of the Hin protein and of
corresponding regions in Gin, Cin, and Pin. Residues in boldface are identical in all four sequences or
at least in three of the four. α Helices 1 to 3 as located in our Hin structure analysis are marked above
the Hin sequence. For crystallography, this Hin fragment was synthesized manually or on an ABI 430A
synthesizer by the solid-phase method as described previously [reference number illegible]. (B) Base sequence of the left DNA
inversion site, hixL. Numbering is to either direction from the center of the inverted repeat. Asterisks mark
purine bases that are protected from methylation by the binding of Hin. (C) The 14-bp synthetic hixL
half-site as cocrystallized with the [...] the right half of the entire hixL site in (B). Bases in strand 2 are
numbered separately. For ease of reference, note that base n in strand 1 is paired with base (32 - n) in
strand 2. Base pairs will be referenced by strand 1 numbers [...]. Phosphates always are [...] and G9,
whereas across the helix [...] phosphate in strand 1 [...] iodinated for purposes of multiple isomorphous
replacement phase analysis.


SCIENCE • VOL. 263 • 21 JANUARY 1994



two residues also are invariant among all
four of the DNA invertases.

By comparison with pairwise amino acid
sequence identities of 66 to 80 percent in
the entire protein sequences, the
carboxyl-terminal peptides shown in Fig. 1A have
somewhat fewer identities, 49 to 62
percent, but still are obviously homologous
proteins. The Hin interactions resemble
those in the binding of DNA by the
homeodomain in eukaryotes. Indeed, as noted by
Affolter et al. (14), the amino acid
sequence of the Hin binding domain can be
aligned with that of the homeodomain of
the Drosophila engrailed protein with a 27
percent sequence identity, sufficient to
suggest a close resemblance.

We report here the 2.3 Å resolution
crystal structure analysis of the 52-amino
acid carboxyl-terminal DNA-binding
domain of Hin complexed with a 14-bp DNA
oligomer containing a half hixL binding site
(Fig. 1C). The Hin peptide forms a
three-α-helix core with an extended chain at
each end. The core interacts with the major
groove of DNA, whereas the flanking
amino- and carboxyl-terminal chains extend
along two regions of the minor groove. The
carboxyl-terminal eight-residue tail of the
Hin peptide crosses the phosphodiester
backbone and is inserted in the minor
groove in a manner that has not heretofore
been encountered in DNA-protein
complexes. The crystal structure provides a
virtually complete explanation of base
specificity experiments in solution, including
mutant studies.

Structure of the complex. The structure of the Hin-DNA complex was solved by
multiple isomorphous replacement with the use of three iodine derivatives
(Table 1). A typical section of the final (2F_o - F_c) map is shown in
Fig. 2. One 52-amino acid Hin domain binds to each 14-bp DNA half-site
(Fig. 3). DNA helices consisting of the 13 complete base pairs are stacked
end-to-end in the crystal, as in figure 2 of (15). The unpaired base T2 on
strand 1 (Fig. 1C) swings up to make a Hoogsteen-like interaction with base
pair 3, G3·C29. At the other end, unpaired base A16 on strand 2 is not
defined in the electron density map and presumably is disordered. The a axis
of the crystal, the direction of stacking of two DNA helices, has a length at
room temperature of 86.4 Å ≈ 26 × 3.32 Å. The 12 base pair steps along the
helix produce a total rotation of 407° (average 33.9° per step), so that the
nonbonded interhelix junction between base pair 15 (A15·T17) of one helix and
base pair 3 (G3·C29) of the next requires a reverse twist of
360° - 407° = -47°.
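
The stacking arithmetic in this paragraph can be checked with a short sketch. The variable names are illustrative assumptions and do not come from the paper; the inputs are only the values quoted in the text.

```python
# Values quoted in the text (variable names are illustrative assumptions).
steps = 12                # base pair steps within one 13-bp helix
avg_twist_deg = 33.9      # average helical twist per step, in degrees
rise_angstrom = 3.32      # rise per base pair implied for the a axis
pairs_per_a_axis = 26     # two stacked 13-bp helices span the a axis

total_rotation = steps * avg_twist_deg           # about 407 degrees
junction_twist = 360.0 - round(total_rotation)   # twist needed at the junction
a_axis_estimate = pairs_per_a_axis * rise_angstrom

print(f"total rotation:  {total_rotation:.1f} deg")
print(f"junction twist:  {junction_twist:.0f} deg")
print(f"a axis estimate: {a_axis_estimate:.2f} A (measured 86.4 A)")
```

The small gap between the estimated and the measured a axis is within the rounding of the quoted rise per base pair.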

The DNA half-site is a standard B-DNA helix, with the usual local variation
in helix parameters (16, 17). The helix is relatively straight and not curved
around the protein domain as in CAP, trp repressor, 434 repressor, and Met
repressor (18-22), and as has been proposed for the Fis-DNA complex (23, 24).
However, an unusually large amount of DNA surface area is contacted by Hin.
Upon binding Hin peptide, the DNA half-site monomer loses 1816 Å² of static
solvent-accessible surface area.

The DNA half-site contains a short run of five A·T base pairs (numbers 4
through 8) that could be regarded as a segment of A-tract DNA. Three frequent
characteristics of A-tract DNA are a straight and unbent helix axis, narrow
minor groove, and large propeller twist, large enough for the formation of
bifurcated hydrogen bonds within the major groove between adjacent base pairs
(25-27). However, in the Hin-DNA complex the minor groove maintains a uniform
width of approximately 6.5 to 8.5 Å (minimal P-P atom separation across the
groove, less 5.8 Å for two phosphate group radii), rather than the 3.5 to
4.5 Å typical of most A-tracts. Propeller twist is large all along the
Hin-DNA complex, averaging



Table 1. Summary of crystallographic analysis. Crystals of Hin domain/DNA
complex were grown by vapor diffusion against 15 to 18 percent PEG1500 as
described elsewhere (15). The structure was determined initially to 3.2 Å
resolution by multiple isomorphous replacement (MIR). Heavy atom parameters
were refined and MIR phases calculated with the program HEAVY (44). The
initial MIR map generated after solvent flattening (45) revealed clear
density for B-form DNA and for most of the protein backbone. This map was
improved further by refining heavy atom parameters against solvent-flattened
phases (46). After two additional cycles of phasing, solvent flattening, and
heavy atom parameter refinement, the final MIR map, with mean phasing figure
of merit of 0.55 for data between 20.0 and 3.2 Å, was used to build a model
of the complex. However, it still was difficult to fit the amino acid
sequence into many regions of the map. Only after phases were extended and
modified to 2.8 Å by the method of Zhang and Main (47) did the map show clear
density for side chains of some "marker" residues. At that point, all
residues could be fitted unambiguously with the exception of the final eight
carboxyl-terminal amino acids. Conventional positional refinement then was
carried out to 2.8 Å with X-PLOR (48, 49). To refine the model further
against a new 2.3 Å data set collected at -150°C, rigid-body refinements were
carried out in successive steps to 2.5 Å. After positional refinement and
simulated annealing, the (2F_o - F_c) map was of sufficient quality to allow
the last eight residues to be built into the minor groove of the DNA,
following a clear and continuous density. Electron density for residues
Ser 183 and Ser 184, however, remained poorly defined. Refinement was
extended to 2.3 Å in four cycles of simulated annealing with X-PLOR prior to
tightly restrained B-factor refinement. At the present stage of refinement,
the agreement of the atomic model to crystallographic data is R = 0.228 for
8.0 to 2.3 Å resolution data. Coordinates have been deposited with the
Brookhaven Protein Data Bank and are available for immediate distribution.



                              ----- Native -----
Parameter                                          IdU      IdC      IdU + IdC

Unit cell dimensions*
  a (Å)                       86.4      84.9       85.9     86.0     86.6
  b (Å)                       84.7      81.4       82.6     82.6     83.5
  c (Å)                       47.4      44.0       47.0     46.4     47.2

Data collection statistics
  Resolution (Å)              2.8       2.3        3.2      3.2      3.5
  Measured reflections        11673     17839      10387    7578     7578
  Unique reflections          4177      5645       2703     2501     2020
  Reflections > 2σ(F)         2653      5346
  Completeness (%)            92.6      80.1       92.1     87.5     87.0
  R_sym† (%)                  5.7       5.6        7.5      7.8      8.4
  Mean isomorphous
    difference‡ (%)                                18.4     16.9     18.3

Phasing statistics
  Resolution (Å)                                   20-3.2   20-3.2   20-3.5
  Cullis R factor                                  0.59     0.58     0.54
  Phasing power§                                   1.80     1.74     2.42

Refinement
  Resolution (Å)                        8.0-2.3
  R factor‖                             0.228
  Reflections with F > 2σ               5346
  Total number of atoms                 978
  Water molecules                       16
  Rms deviations
    Bond lengths (Å)                    0.024
    Bond angles (deg.)                  3.97

*Space group C222₁ with one Hin-DNA complex per asymmetric unit.
†$R_{sym} = \sum |I - \langle I \rangle| / \sum I$, where $I$ is the observed
intensity and $\langle I \rangle$ is the averaged intensity obtained from
multiple observations of symmetry-related reflections. ‡The mean isomorphous
difference is $\sum \bigl| |F_{PH}| - |F_P| \bigr| / \sum |F_P|$, where
$|F_P|$ and $|F_{PH}|$ are structure factor amplitudes of protein and heavy
atom derivative of the protein, respectively. §Phasing power is the mean
amplitude of heavy atom structure factors, $F_H$, divided by $E$, the
root-mean-square lack-of-closure error. The mean figure of merit, 20.0 to
3.2 Å, is 0.55. ‖Crystallographic R factor
$= \sum \bigl| |F_o| - |F_c| \bigr| / \sum |F_o|$, where $F_o$ and $F_c$ are
the observed and calculated structure magnitudes, respectively.

-16°, but is not systematically larger in the A-A-A-A-A region than
elsewhere. Hence the five A·T base pairs do not constitute a classical
A-tract structure, perhaps because of binding to Hin.

Overall protein folding. The 52-amino acid DNA-binding domain of Hin consists
of a compact bundle of three α-helices, with extended amino-terminal arm and
carboxyl-terminal tail (Fig. 3). α-Helix 1 (Glu 148 to Lys 158) lies parallel
to the axis of the DNA, α-helix 2 (Arg 162 to Phe 169) is nearly antiparallel
to helix 1 with an angle of -25° between helix axes, and α-helix 3 (Val 173
to Phe 180) is inserted in the major DNA groove parallel to the base pairs
(not to the floor of the groove itself). The HTH motif formed by helices 2
and 3 is similar to those found in other prokaryotic regulatory DNA-binding
proteins. The Hin HTH region can be superimposed on equivalent regions of Fis
(23, 24) and λ repressor (28) with root-mean-square deviations in Cα atomic
positions of only 0.61 and 0.76 Å, respectively.

All three α helices are amphipathic, with hydrophobic residues packed tightly
against one another in a hydrophobic core (Fig. 3A). Ile 152 and Leu 156 of
helix 1 interdigitate with Leu 163 and Phe 169 of helix 2. Val 173, Leu 176,
and Phe 180 of helix 3 also point into the hydrophobic core, which is
delineated by the blue polypeptide backbone regions in Fig. 3A. At the bottom
in that view, Ile 144 on the amino-terminal arm closes the hydrophobic
pocket. These hydrophobic interactions appear to be the main forces
stabilizing the folding of the Hin protein. They also are strongly conserved
among the other DNA invertases, Gin, Cin, and Pin (Fig. 1A), supporting the
inference that all four of these proteins are folded in the same way.
Hydrophobic interactions are supplemented by hydrogen bonds between side
chains: Arg 162 (invariant among the four invertases), at the beginning of
helix 2, is hydrogen-bonded to main chain carbonyl oxygens of Phe 180 (the
final residue of helix 3) and Pro 181. Most charged side chains of the
protein are either in contact with DNA or exposed to the solvent.

Major groove protein-DNA interactions. α-Helix 3 is the DNA recognition helix
for the Hin protein; helices 1 and 2 are too far from the DNA to permit
direct interactions. Only Gln 163 at the amino terminus of helix 2 makes an
indirect DNA contact through a hydrogen bond to residue Tyr 177 (invertase
invariant) in helix 3, which in turn contacts phosphate P19 (Fig. 3B). Five
interactions between helix 3 and DNA backbone phosphates position the
recognition helix properly, and two amino acid side chains, Ser 174 and
Arg 178, make specific bonds to the edges of base pairs G9·C23 and A10·T22,
as detailed below. It is significant that four of the eight amino acids in
helix 3 are completely invariant among the four DNA invertases, and another
three are semi-invariant, with the same residue in three sequences and a
closely related one in the fourth.

The five nonspecific interactions with DNA backbone phosphates are depicted
in Fig. 3B. The side chain of Tyr 177 reaches up to phosphate P19 on one edge
of the major groove, whereas Tyr 179 on the other side of the α helix reaches
down to phosphate P8 directly across the groove on the other wall. One of the
terminal -NH2 groups of the Arg 178 side chain donates a hydrogen bond to the
remaining oxygen of phosphate P8. The side chain of Thr 175 and the main
chain amide of Gly 172 anchor helix 3 even further by donating hydrogen bonds
to phosphate P9. In contrast to other HTH DNA-binding proteins, all of these
nonspecific anchoring contacts to DNA phosphates are made by residues of
helix 3; the "three-point contact" made by residues Gly 172, Thr 175,
Tyr 177, and Tyr 179 efficiently braces helix 3 against the opening of the
major groove of DNA in a position to make specific recognition interactions.

Specific base sequence recognition uses only two Hin side chains, Ser 174 and
Arg 178, and in part involves the mediation of water molecules (Fig. 4) in a
manner that has been proposed for the trp repressor (29). The side chain of
Ser 174 donates a hydrogen bond to the N-7 atom of A10. One turn of α helix
away from this position, the terminal -NH2 of Arg 178 that is not involved
with P8 donates a similar hydrogen bond to the N-7 of G9. The Arg 178 ε-imino
nitrogen donates another hydrogen bond to bound water molecule 1, which in
turn donates a bond to the O-4 of T22, essentially "reading" the fact that
this base pair is indeed A10·T22 and not a G·C pair. The other proton of this
water molecule bonds to water molecule 2, which receives another hydrogen
bond from the N-6 amine of A21 (recognizing this as an A·T pair and not G·C),
and donates hydrogen bonds to N-7 of the same base and to the main chain
carbonyl oxygen of Ser 174. Ser 174 is invariant among all four DNA
invertases. Arg 178 is substituted only by Lys, and when this happens, two
basic side chains always appear adjacent at positions 178 and 179 (Fig. 1A),
meaning that some of the hydrogen bonds of the Hin structure could well be
preserved. Both of the bound waters have the tetrahedral geometry expected
for water molecules, donating two hydrogen bonds and accepting two others.
The fourth bond to water 1, not shown explicitly in Fig. 4B, must be to
another water molecule not well localized in the electron-density map.
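
The hydrogen-bond bookkeeping for the two bound waters can be tallied with a minimal sketch. The labels are informal and the list is only a transcription of the bonds described in the paragraph above; the "unlocalized water" entry stands for the fourth partner of water 1 that the text infers but does not observe, so its direction here is an assumption that follows from water 1 already donating two bonds.

```python
from collections import Counter

# Hydrogen bonds described in the text, written as (donor, acceptor).
# Labels are informal. "unlocalized water" is the inferred fourth partner
# of water 1, which is not resolved in the electron density map.
bonds = [
    ("Arg178 N-epsilon", "water 1"),
    ("water 1", "T22 O-4"),
    ("water 1", "water 2"),
    ("unlocalized water", "water 1"),
    ("A21 N-6", "water 2"),
    ("water 2", "A21 N-7"),
    ("water 2", "Ser174 C=O"),
]

# Count how many bonds each group donates and accepts.
donated = Counter(donor for donor, _ in bonds)
accepted = Counter(acceptor for _, acceptor in bonds)

for water in ("water 1", "water 2"):
    print(water, "donates", donated[water], "and accepts", accepted[water])
```

Each water ends up with two donated and two accepted bonds, which is the tetrahedral pattern the text describes.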

All of these specific interactions are drawn in Fig. 4, A and B. Together
they recognize base pairs 9, 10, and 11 of the half recombination site.
Indeed, the binding of Hin is particularly sensitive to alterations of base
pairs 9 and 10 (13). Dimethyl sulfate modification of the N-7 position of G9
inhibits Hin binding (8, 10). Methylation at the N-6 position of A10 by the
deoxyadenosine methylase of Salmonella decreases binding affinity (13). Hin
binding also is strongly and adversely affected by mutations at these sites,
being inhibited by substitution of C at positions 9 and 11 or of either C or
T at position 10.

Fig. 2. Stereo view of the (2F_o - F_c) electron density map at 2.3 Å
resolution (blue contours) showing portions of DNA base pairs 8 to 11 (top)
and the region of helix 3 around Arg 178 and Tyr 179 (marked). Contour level
is 1σ. The protein framework is in red, and DNA is in green.



Minor groove protein-DNA interactions: the amino-terminal arm. Genetic and
biochemical studies have demonstrated that contacts made by the
amino-terminal arm of the Hin DNA-binding domain are at least as critical to
DNA recognition as are those of helix 3 (10, 12, 13). Indeed, merely deleting
Gly 139 and Arg 140 from the Hin DNA-binding peptide is sufficient to abolish
specificity of binding to hixL (12). These residues are invariant in all of
the DNA invertases.

Fig. 3. (A) Stereo diagram of the Hin-DNA complex. DNA is in blue stick
bonds. The path of the Hin polypeptide chain is shown as a flexible tubing:
orange in general, but blue for hydrophobic residues. Side chains that
contact the DNA are drawn in orange and labeled in yellow. Note the
amino-terminal arm within the minor groove at lower right (Gly 139-Arg 142),
the carboxyl-terminal tail in the minor groove at upper left
(Ile 185-Asn 190), and helix 3 nested in the major groove. Pink spheres are
two bound water molecules involved in protein-DNA contacts within the
recognition site. Hydrogen bonds are in green. (B) Schematic drawing of the
complex, in the same view as the stereo. Helices are numbered 1 to 3, and key
side chains that interact with the DNA are identified. Open dots along
backbone ribbons locate Cα atoms. Phosphates 8, 9, and 19 are indicated
specifically on the ribbons. Base pairs 9 to 11 are shown in stereo close-up
in Fig. 4A, base pairs 4 to 8 in Fig. 5B, and pairs 9 to 15 in Fig. 7.

The amino-terminal arm, Gly 139-His 147, adopts an extended conformation
(Fig. 5). Clear electron density (Fig. 5A) allows Gly 139 and Arg 140 to be
located unambiguously within the minor groove. The ε-imino of the Arg 140
side chain donates a hydrogen bond to N-3 of A26. The unusually high 26°
propeller twist of base pair 6 (T6·A26) permits a second hydrogen bond from
the main chain amide of Arg 140 to the O-2 of T6. If a G·C pair were to be
substituted, this latter bond would become impossible, and the N-2 amine of
guanine would push the Arg 140 side chain away. Although the neighboring A·T
base pairs are less propeller-twisted, the ability of an A·T pair to adopt
such a large propeller contributes to the recognition process (25-27).
Gly 139 rests in close van der Waals contact with base pair 5; the main chain
Cα atom of that residue is only 3.4 Å from the O-2 of T5 and 4.1 Å from the
C-2 of A27. Introduction of an amine group at that locus, as in guanine,
would push the Hin polypeptide chain up and away from the floor of the minor
groove at that point. Each base pair substitution of A·T by G·C at positions
5 and 6 abolishes the binding affinity of Hin. Indeed, A·T base pairs at
positions 5 and 6 are universally present in all of the recombination sites
of various enteric inversion systems: hixL and hixR; gixL and gixR; cixL and
cixR; and pixL and pixR (1) (Fig. 6). Biochemical footprinting experiments
also show that both intact Hin and the Hin peptide protect adenines 5 and 6
from methylation (10).

Pro 141 arches across one wall of the minor groove, and the ε-imino of
Arg 142 is hydrogen-bonded to phosphate P8, an interaction that may be
important in directing the amino-terminal arm into the minor groove of the
DNA. Hin, Gin, Cin, and Pin all have Pro at position 141 or 142, followed
immediately by a basic Arg or Lys. A significant role may be played by
Ile 144, which interacts with the hydrophobic core formed by packing helices
1 to 3 against one another. Ile 144 may restrict movement of the
amino-terminal arm, thus bringing Arg 142 into proximity to phosphate P8. In
other DNA invertases, position 144 is always a bulky hydrophobic side chain,
either Leu or Tyr (Fig. 1A).

Minor groove protein-DNA interactions: the carboxyl-terminal tail. The
carboxyl-terminal tail of the Hin polypeptide crosses the phosphodiester
backbone at the outer edge of the recombination site and then follows the
minor groove back toward the center of the 13-bp DNA helix (Figs. 3 and 7).
The final six residues of the Hin polypeptide,
Ile 185-Lys 186-Lys 187-Arg 188-Met 189-Asn 190, adopt an extended
conformation and lie within the minor groove, but the side chains themselves
make no contacts with the floor of the groove. Instead, they point outward,
with the polypeptide backbone resting against base edges. At the point where
the final six amino acid residues dip into the minor groove, the main chain
CO of Ile 185 hydrogen bonds to the N-2 of G14. The main chain NH of Lys 187
bonds to the O-2 of T20, and a little farther along, the main chain amide of
Asn 190 interacts with the O-2 of T22 while the side chain of the
carboxyl-terminal residue bonds to the N-3 of A10. These interactions may be
responsible for the large propeller twist and roll angles of base pairs 10
and 12 and the -16° bend of DNA toward the major groove. Consistent with the
interactions just discussed, the N-3 of A10 is partially protected from
dimethyl sulfate attack by Hin binding (10). In addition, Mack et al. (11)
have noted that a Hin peptide lacking the last eight residues, when modified
with EDTA-Fe, cleaved DNA with reduced efficiency as compared with a peptide
containing the complete carboxyl terminus. This portion of the chain is
variable among the DNA invertases: whereas Hin has six final amino acids, Gin
has ten, Cin has three, and Pin has a lone Lys (Fig. 1A). This
carboxyl-terminal tail is presumably supportive but not essential.

Fig. 4. (A) Stereo close-up of the interaction between DNA and helix 3 in the
major groove, viewed approximately down the DNA helix axis. Base pair T11·A21
is nearest the viewer, with A10·T22 beneath, and G9·C23 farthest away. Strand
1 backbone continues to lower right through P9 and P8. Helix 3 is drawn as a
smooth curve, with specific depiction, from top to bottom, of residues
Gly 172, Ser 174, Thr 175, Arg 178, and Tyr 179. Two shaded spheres are bound
water molecules. (B) Schematic of specific base pair recognition involving
Ser 174, Arg 178, and two bound water molecules. Along base edges, "a" marks
a hydrogen bond acceptor (ring N-7 or carbonyl O-4 or O-6) and "d" marks a
hydrogen bond donor (N-4 or N-6 amine group).

The hydrogen-bonded extension of the last six residues along the floor of the
minor groove recalls A·T-specific binding of minor groove drugs such as
netropsin and distamycin (30, 31). Such binding involves an element of base
specificity: if any of the base pairs 10 to 13 were G·C rather than A·T, then
the tail of the Hin peptide would be pushed away from the floor of the
groove. In another context, it has been proposed (32, 33) that an extended
polypeptide containing repeats of the SPKK (34) sequence may interact with
the DNA minor groove in a similar fashion to netropsin, with main chain amide
nitrogens forming hydrogen bonds with base pairs in the minor groove. Our
structure seems to provide a concrete example of such a model.

Fig. 5. Stereo views of the amino-terminal arm of the Hin peptide, in the
minor groove of DNA. (A) The refined (2F_o - F_c) electron density map (blue
framework) with minor groove vertical, showing Arg 140-Pro 141-Arg 142
looping over the phosphate backbone toward the right. Protein is in red and
DNA is in green. (B) View along the minor groove, from the top in (A),
showing the entire "A-tract" region, from T4·A28 at bottom through T8·A24 at
top. This view is approximately that of the lower half of Fig. 3B. The ribbon
extending from center toward upper right is the Hin peptide region
Gly 139-Arg 140-Pro 141-Arg 142, with side chains drawn explicitly.

Fig. 6. Base sequences in enteric bacterial inversion sites: the left and
right inversion sites from the Hin inversion system of Salmonella, the Gin
inversion elements from bacteriophage Mu, Cin from phage P1, and Pin from the
e14 prophage of Escherichia coli (7). Only one chain from each complex is
shown; the other is complementary as in Fig. 1B. Each of the eight sites is
built from two roughly symmetrical half-sites. Asterisks at bottom indicate
positions that are especially important in site recognition by invertases.

          Left half-site              Right half-site
          -11       -6       -1 +1       +6       +11
hixL  5'-T-T-C-T-T-C-A-A-A-A-C-C-A-A-C-C-T-T-T-T-T-O-Jk-T-A-A-3'
hixR  5'-'-T-fr-T-C-C-T-T-T-T-G-C-A-A-O-G-T-T-T-T-T-Q-A-l-A-A-J'
gixL  5'-T-T-C-C.»-C-T-A-A-A-C-C-G-A-C-G-T-*-t-T-C-0-A-T-A-A-3'
gixR  5'-T-^-C-C-^-O-T-A-A-A-C-C-a-A-C-G-T-M-T-O-O-A-T-A-A-3'
cixL  5'-T-T^C-T-C-T-T-A-A-A-C-C-A-A-<I-<3-T-T-T-A-G-a.A-T-'r-0-3'
cixR  5'-T-T-C-T-C-T-T-A-A-A-C-C-A-A-O-O-T-A-T-T-O-O-A-T-A-A-3'
pixL  5'-T-T-C-T-C-C-C-A-A-A-C-C-A-A-C-C-T-T-T-T-C-O-A-O-A-O-3'
pixR  5'-T-T-C-T-C-C-C-A-A-A-C-C-A-A-C-C-T-T-T-A-T-O-A-A-A-A-3'

The molecular basis of specificity. Two aspects of the Hin system make it
especially conducive to an understanding of the effects of sequence on
specificity. The first is the availability of binding data on 39 different
base substitution mutations generated by Hughes et al. (13). They constructed
a symmetrized hixC sequence in which the left half is given the complementary
sequence to the right half shared by both hixL and hixR and established that
this symmetrized hixC binds Hin fully as well as the wild-type hixL and hixR.
(It is the hixC sequence that we used for crystallographic analysis.) They
then constructed an exhaustive set of symmetrical mutants in the two halves
of hixC, varying each of the 13 positions among all three of the other bases.
Hence we have complete information about the strength of Hin binding with
every possible single-base change in the optimal hixC sequence. The second
favorable aspect is the existence of four homologous DNA inversion systems:
Hin, Gin, Cin, and Pin. Taken together, these provide 8 complete
recombination sites or 16 binding half-sites (Fig. 6). How far can our x-ray
analysis of


Table 2. Frequency of occurrence of bases at key positions in bacterial
inversion sites. Sequences are read in a 5'-to-3' direction from the center
of the inversion site as shown in Fig. 6. Hence for left half-sites the other
chain, not shown in Fig. 6, is tabulated. Allowed substitution data derive
from the frequency of lysogenization in a P22 challenge phage assay at 100 µM
isopropyl-β-D-thiogalactopyranoside (IPTG) concentration, table 1 of (13).

            Natural hix, gix,      Acceptable mutations
            cix, and pix sites     in symmetrized          DNA
Position    G      A      T        hixC site (13)          groove

5                  2      14       Not G, not C            Minor
6                  1      15       Not G, not C            Minor
9           13     3               Not C                   Major
10          2      14              Not T, not C            Major
11          6      2      6        Not C                   Major
12                 15     1        Not C                   Minor
13          2      14              All equivalent          Minor


the Hin-DNA complex account for this wealth of data, and provide a molecular
basis for DNA-protein specificity?

The Hin-DNA crystal structure shows that all three components of the Hin
peptide, the amino-terminal arm, the HTH region, and the carboxyl-terminal
tail, contribute to base sequence recognition. The phosphate backbone
contacts of helix 3 help position Hin on the DNA, but one could easily
imagine that Hin could slide along the DNA in a nonspecific manner until it
encounters the correct local base sequence.

Interactions of Hin residues Ser 174 and Arg 178, both direct and through
intermediate water molecules (Fig. 4B), place restrictions on base pairs 9,
10, and 11. Some latitude in base sequence is possible if different
arrangements of hydrogen bond donors and acceptors are permitted. Those
rearrangements that are possible without losing the total number of hydrogen
bonds are shown in Fig. 8. Base 9 is restricted to being a purine (G or A) by
virtue of the hydrogen bond donated to ring atom N-7. In complete agreement
with this model, of the 16 half-sites shown in Fig. 6 and listed in Table 2,
G occurs 13 times at position 9 and A occurs 3 times. No pyrimidine is ever
found at that locus. Replacement of G at position 9 by A or T (and C at
position -9 by T or A) in the mutant studies of Hughes et al. (13) is
acceptable, but C at position 9 reduces binding significantly. Our crystal
structure shows why: G, A, or T at position 9 offers a hydrogen bond acceptor
to Arg 178, whereas C offers an N-4 amine donor instead, and hence is
disfavored. Even T is less favorable, because it positions the hydrogen bond
acceptor differently and partially blocks it with its own C-5 methyl group.

Position 10 also must be a purine for the 
same reason as 9. The choice in 14 of the 




16 half-site sequences is A10, resulting in T22 at the other end of the base
pair, with a hydrogen bond-accepting O-4 atom (Fig. 8A). However, G10·C22 is
an acceptable minority choice in two sequences, and in that case the N-4
amine of C22 must donate a hydrogen bond to water 1 (Fig. 8B). Water 1 then
would donate a hydrogen bond to another water molecule not shown here.

Fig. 7. Linear extension of carboxyl-terminal residues 185 to 190 down the
floor of the minor groove. (A) Stereo representation. DNA base pair G9·C23 is
at left and A15·T17 at right. Amino acid side chains are identified in (B),
which is a sketch from the same orientation. The main polypeptide chain atoms
are in black dots; side chains are in open circles.

Base pair 11 is more variable than might have been expected, and for an
interesting reason. Water molecule 2 in Fig. 8A accepts a hydrogen bond from
the N-6 of A21 and donates a bond to N-7, but all that is required for a
hydrogen balance is that this water molecule should donate one hydrogen and
accept another. The two base atoms could just as easily be thymine O-4 and
adenine N-6 as in Fig. 8A or cytosine N-4 and guanine O-6 as in Fig. 8B. The
only combination not permitted would be guanine at position 21, with dual
acceptors N-7 and O-6, for in that case water 2 would not have enough protons
to form the bond with the main chain carbonyl of Ser 174. In other words, the
requirement on the strand 1 side of base pair 11 is "not-C," and indeed this
requirement is borne out in Table 2 both by the invertase family sequence
comparisons and mutational substitutions.

The direction of the hydrogen bond between water molecules (water 1 donating
to water 2) actually is firmly established. As Fig. 8C shows, if water 2 were
to donate a bond both to water 1 and to the Ser 174 main chain carbonyl, then
the two positions on base pair 11 would have to be adjacent donors, and no
base pair shows this behavior. The full pattern of hydrogen bonds can be
maintained only by arrangements as in Fig. 8, A and B.

A·T base pairs are favored at positions 12 and 13 because of a netropsin-like
extension of the carboxyl-terminal six Hin residues down the floor of the
minor groove (Fig. 7). Table 2 shows that A·T pairs indeed are overwhelmingly
favored at these two loci, although mutant studies show position 13 to be
more permissive. It could be that the similarity of sequences at this point
among all of the enteric recombinases is a matter of evolutionary divergence
from a common ancestral sequence, rather than convergence on function.

At the other end of the recognition domain, positions 4, 5, and 6 universally
are A·T base pairs in all of the DNA inversion sites, and G·C pairs are
strongly disfavored at positions 5 and 6 in the mutant studies. The x-ray
structure shows the reason: Hin residues Gly 139 and Arg 140 are so
intimately linked to the floor of the minor groove that there simply is no
room for the C-2 amine group of guanine. As noted above, Gly 139 and Arg 140
are absolutely essential for sequence-specific binding of Hin to its DNA
site.

In summary, the recognition element of the enteric inversion half-sites
appears to involve two A·T base pairs (5 and 6) recognized by amino acid
residues Gly 139 and Arg 140 in the minor groove, two nonspecific base pairs
(7 and 8), and then a five-base sequence (9 to 13) recognized by helix 3 and
the carboxyl-terminal tail in major and minor grooves, respectively. The
optimal binding sequence is shown in Fig. 9. The Hin dimer has ~100-fold
higher binding affinity to the full recombination site than does the Hin
peptide binding to a half-site, and hence cooperative interactions by the Hin
dimer may also contribute to recognition.
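
The per-position restrictions summarized above can be written down as a small lookup, shown here as a minimal sketch rather than as anything from the paper. The rule set is only a transcription of the "acceptable mutations" column as read from Table 2; the function name `violations` and the example half-sites are hypothetical, and positions 7 and 8 carry no rule because the text treats them as nonspecific.

```python
# Disfavored bases per half-site position, as read from the
# "acceptable mutations" column of Table 2 (position 13 tolerates any base).
DISALLOWED = {
    5: {"G", "C"},
    6: {"G", "C"},
    9: {"C"},
    10: {"T", "C"},
    11: {"C"},
    12: {"C"},
    13: set(),
}


def violations(half_site):
    """Return the sorted positions whose base is disfavored by the rules.

    half_site maps a position number to the base on the strand read
    5'-to-3' from the center of the site. Positions without a rule
    are ignored.
    """
    return sorted(
        pos
        for pos, base in half_site.items()
        if base.upper() in DISALLOWED.get(pos, set())
    )


# Bases at 5, 6, 9, 10, and 11 follow the numbering used in the text
# (T5, T6, G9, A10, T11); the A at 12 and 13 is an illustrative assumption.
example = {5: "T", 6: "T", 9: "G", 10: "A", 11: "T", 12: "A", 13: "A"}
mutant = {**example, 9: "C", 10: "T"}

print(violations(example))  # no disfavored positions
print(violations(mutant))   # positions 9 and 10 are flagged
```

A lookup of this kind only flags single-position restrictions; it does not model the roughly 100-fold cooperative effect of the Hin dimer mentioned above.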

Hin-DNA versus other HTH DNA-binding complexes. The HTH motif occurs
frequently in DNA-binding proteins, including prokaryotic regulators (18, 20,
28, 35), eukaryotic homeodomains (36-38), eukaryotic transcription factors
such as HNF-3γ (39), the Oct-1 POU-specific domain, POUs (40), the third
tandem repeat of the cMyb protein (41), and the globular domain of histone
H5, GH5 (42). Complexes of these proteins with DNA show two principal
variants that are represented in Fig. 10 by the λ repressor and the engrailed
homeodomain. In all cases, recognition helix 3 fits into the major groove and
helix 2 runs across the width of the groove. In homeodomain structures,
helix 1
SCIENCE • VOL. 263 • 21 JANUARY 1994 



lies essentially antiparallel to helix 2, which also spans the width of the
major groove. The residues preceding helix 1 are positioned to interact with
the minor groove, and recognition helix 3 is oriented along the floor of the
major groove. This pattern is persistent; the cluster of three helices is
virtually superimposable from one homeodomain complex to the next. In
contrast, prokaryotic regulatory proteins (18, 20, 28, 35) have similar
dispositions of helices 2 and 3, but with the exception of trp repressor,
helix 1 is swung out and away from the DNA (Fig. 10A). Helix 3, on the other
hand, tends to lie parallel to the edges of base pairs in the prokaryotic
regulators, rather than along the floor of the groove as in homeodomains.

The present Hin-DNA complex is inter- 
mediate between these two structures. Rec- 
ognition helix 3 is parallel to base pair edges 
rather than to the groove itself, as with 
prokaryotic repressors, but helix 1 is nearly 
antiparallel to 2, with its amino terminus 
extending toward the minor groove where a 
nonhelical chain continuation contributes 
interactions essential to specific recogni- 
tion. The alignment of helix 3 parallel to 
base pairs in Hin and repressors is function- 
al: comparison of the structures of 434 
repressor-operator, 434 Go-operator, X re- 
pressor-operator, and CAP-DNA complexes, 
shows that amino acids at positions 1, 2, 
and 6 of the helix make specific contacts 
with bases in the major groove in each case 

Fig. 10. Comparative interactions of a three-helix unit with the DNA double helix in the λ repressor-operator 
complex (left), the Hin-DNA complex (center), and the engrailed homeodomain complex 
(right). In each case, the carboxyl-terminal helix of the three is inserted into the major groove. 



(43). Similarly, Hin uses positions 2 and 6 
(Ser174 and Arg178) for its primary recognition 
process. 

The disposition of helix 1 with respect 
to the DNA is not completely identical in 
Hin and homeodomains; in the latter, helix 
1 crosses perpendicular to the walls of the 
major groove, whereas Hin has helix 1 
aligned parallel with the overall DNA axis. 
The two loops connecting helices are shorter 
in Hin than in the homeodomain proteins 
and more like those of prokaryotic 
repressors. In addition, the α helices themselves 
are shorter than their homeodomain 
equivalents, especially helix 3, which in 
both yeast α2 and Drosophila engrailed homeodomains 
is 17 amino acids long. 
There also is a remarkable difference in 
minor groove binding by the sequence Arg-Pro-Arg 
in the amino-terminal arm of Hin 
and in the engrailed homeodomain. In 
Hin, Arg140 makes specific base contacts, 
whereas Arg142 anchors the two-pronged 
fork by binding to the phosphate backbone 
of DNA. In the engrailed homeodomain, 
Arg3-Pro4-Arg5 inserts both Arg side chains 
into the minor groove and makes specific 
base contacts, and it is the adjacent Thr6 
that interacts with the phosphate backbone. 
Thus, the same three-amino acid recognition 
module can interact with the minor 
groove in two profoundly different ways to 
generate contacts essential for protein-DNA 
binding. 

In another significant difference, the 
amino-terminal arms of homeodomain proteins 
run along the minor groove in the 
direction of the HTH unit itself, whereas 
the amino-terminal arm of the Hin protein 
runs in an opposite direction (Figs. 3A and 
9), toward what would be the center of the 
recombination site. Model-building of an 
intact recombinase bound to a complete hix 
site suggests that the positions cleaved by 
Hin during recombination are located on 
the opposite face of hix from the DNA-binding 
domains. It is likely that residues 
preceding the amino-terminal arm within 
an intact Hin protomer may continue running 
along the minor groove around the 
DNA and link with the catalytic domain 
that presumably is positioned over the 
cleavage site. 

contains large identifiable cells and is 
consequently, like Hirudo and Aplysia, 
a favourite preparation for studying 
neural mechanisms at the cellular level, 
and in particular for studying isolated 
neurons in culture. 

helix-coil transition See random coil. 

helix-destabilising proteins (single- 
stranded binding proteins) Proteins in- 
volved in DNA replication. They bind 
cooperatively to single-stranded DNA, 
preventing the reformation of the 
duplex and extending the DNA back- 
bone, thus making the exposed bases 
more accessible for base pairing. 

helix-loop-helix A motif associated with 
transcription factors, allowing them to 
recognise and bind to specific DNA 
sequences. Two α-helices are separated 
by a loop. Examples: myoblast MyoD1, 
c-myc, Drosophila genes daughterless, 
hairy, twist, scute, achaete, asense. Not the 
same as helix-turn-helix. 

helix-turn-helix A motif associated with 
transcription factors, allowing them to 
bind to and recognise specific DNA 
sequences. Two amphipathic α-helices 
are separated by a short sequence with a 
β-sheet. One helix lies across the major 
groove of the DNA, while the recognition 
helix enters the major groove, and 
interacts with specific bases. An example 
in Drosophila is the homeotic gene fushi 
tarazu, that binds to the sequence 
TCAATTAAATGA. Not the same as 
helix-loop-helix. 

helodermin See exendin. 

helospectin See exendin. 

helper factor A group of factors appar- 
ently produced by helper T-lymphocytes 
that act specifically or non-specifically to 
transfer T-cell help to other classes of 
lymphocytes. The existence of specific 
T-cell helper factor is uncertain. 

helper T-cell See T-helper cells. 

helper virus A virus that will allow the 
replication of a co-infecting defective 
virus by producing the necessary protein 



hema-, hemo- See haema, haemo. 
heme See haem. 

hemicellulose Class of plant cell-wall 
polysaccharide that cannot be extracted 
from the wall by hot water or chelating 
agents, but can be extracted by aqueous 
alkali. Includes xylan, glucuronoxylan, 
arabinoxylan, arabinogalactan II, 
glucomannan, xyloglucan and galacto- 
mannan. Part of the cell-wall matrix. 

hemidesmosome Specialised junction 
between an epithelial cell and its basal 
lamina. Although morphologically sim- 
ilar to half a desmosome (into which 
intermediate cytokeratin filaments are 
also inserted), different proteins are 
involved. 

hemizygote Nucleus, cell or organism 
that has only one of a normally diploid 
set of genes. In mammals the male is 
hemizygous for the X-chromosome. 

Hensen's node (primitive knot) Thicken- 
ing of the avian blastoderm at the 
cephalic end of the primitive streak. 
Presumptive notochord cells become 
concentrated in this region. May well be 
a source of retinoic acid that is acting as a 
morphogen in the developing embryo. 

heparan sulphate (glycosaminoglycan) 
Constituent of membrane-associated 
proteoglycans. The heparan sulphate- 
binding domain of NCAM is proposed to 
augment NCAM-NCAM interactions, 
suggesting that cell-cell bonds mediated 
by NCAM may involve interactions 
between multiple ligands. The putative 
heparin-binding site on NCAM is a 28 
amino acid peptide shown to bind both 
heparin and retinal cells, as well as to 
inhibit retinal cell adhesion to NCAM. 
This strengthens the argument that 
this site contributes directly to NCAM- 
mediated cell-cell adhesion. 

heparin Sulphated mucopolysaccharide, 
found in granules of mast cells, that 
inhibits the action of thrombin on fibrin- 
ogen by potentiating anti-thrombins, 
thereby interfering with the blood-clotting 
cascade. Platelet factor IV will 
neutralise heparin. 

zontal transmission, long incubation 
periods and chronic progressive phases. 
Visna virus is in this group, and there are 
similarities between visna, equine infec- 
tious anaemia virus and HIV. 

lentoid Spherical cluster of retinal cells, 
formed by aggregation in vitro, that has a 
core of lens-like cells inside which accu- 
mulate proteins characteristic of normal 
lens. The cells concerned derive from 
retinal glial cells. 

Lepore haemoglobin Variant haemo- 
globin in a rare form of thalassaemia; 
there is a composite δ-β chain as a result 
of an unequal crossing-over event. The 
composite chain is functional but syn- 
thesised at a reduced rate. 

leprosy Disease caused by Mycobacterium 
leprae, an obligate intracellular parasite 
that survives lysosomal enzyme attack 
by possessing a waxy coat. Leprosy is 
a chronic disease associated with 
depressed cellular (but not humoral) 
immunity; the bacterium requires a tem- 
perature lower than 37°C, and thrives 
particularly in peripheral Schwann cells 
and macrophages. Only humans and the 
nine-banded armadillo are susceptible. 

leptonema See leptotene. 

Leptospira Genus of spirochaete bacteria 
that cause a mild chronic infection in rats 
and many domestic animals. The bac- 
teria are excreted continuously in the 
urine, and contact with infected urine or 
water can result in infection of humans 
via cuts or breaks in the skin. Infection 
causes leptospirosis or Weil's disease, a 
type of jaundice, which is an occupa- 
tional hazard for sewerage and farm 
workers. 

leptospirosis Weil's disease, caused by 
infection with Leptospira. 

leptotene Classical term for the first stage 
of prophase I of meiosis, during which 
the chromosomes condense and become 
visible. 

Lesch-Nyhan syndrome A sex-linked 
recessive inherited disease in humans 
that results from mutation in the gene for 



the purine salvage enzyme HGPRT, 
located on the X-chromosome. Results in 
severe mental retardation and distress- 
ing behavioural abnormalities, such as 
compulsive self-mutilation. 

lethal mutation Mutation that even- 
tually results in the death of an organism 
carrying the mutation. 

LETS (large extracellular transformation/trypsin-sensitive 
protein) Originally 
described as a cell surface protein 
that was altered on transformation in 
vitro; now known to be fibronectin. 

Leu enkephalin A natural peptide 
neurotransmitter; see enkephalins. 

leucine (leu; L; 2-amino-4-methyl- 
pentanoic acid; 131 D) The most abund- 
ant amino acid found in proteins. Con- 
fers hydrophobicity and has a structural 
rather than a chemical role. See Table 
A2. 

leucine aminopeptidase An exopep- 
tidase that removes neutral amino acid 
residues from the N-terminus of pro- 
teins. 

leucine zipper Motif found in certain 
DNA-binding proteins. In a region of 
approximately 35 amino acids, every 
seventh is a leucine. This facilitates 
dimerisation of two such proteins to 
form a functional transcription factor. 
Examples of proteins containing leucine 
zippers are products of the proto- 
oncogenes myc, fos and jun. See also 
AP-1. 

leucinopine (dicarboxypropyl leucine) 
An analogue of nopaline found in crown 
gall tumours (induced by Agrobacterium 
tumefaciens) that do not synthesise 
octopine or nopaline. 

leucocidin Exotoxins from staphylococ- 
cal and streptococcal species of bacteria 
that cause leucocyte killing or lysis. 

leucocyte (USA leukocyte) Generic term 
for a white blood cell. See basophil, 
eosinophil, lymphocyte, monocyte, neu- 
trophil. 

leucocytosis An excess of leucocytes in 
the circulation. 

Z scheme of photosynthesis A schematic 
representation of the light reactions 
of photosynthesis, in which the 
photosynthetic reaction centres and elec- 
tron carriers are arranged according to 
their electrode potential (free energy) 
in one dimension and their reaction 
sequence in the second dimension. This 
gives a Z shape, the two reaction centres 
(of photosystems I and II) being linked 
by the photosynthetic electron transport 
chain. 

Z-disc Region of the sarcomere into 
which thin filaments are inserted. Loca- 
tion of a-actinin in the sarcomere. 

Z-DNA See DNA. 

zeatin A naturally occurring cytokinin, 
originally isolated from maize seeds. Its 
riboside is also a cytokinin. 

zebrafish Brachydanio rerio; fish with a 
transparent embryo making it possible 
to follow progeny of single cells until 
quite late stages of development. This, 
together with the availability of mutant 
lines, makes it an important preparation 
for the study of vertebrate cell lineage. 

zeta potential The electrostatic potential 
of a molecule or particle (e.g. a cell), measured 
at the plane of hydrodynamic 
slippage outside the surface of the molecule 
or cell. Usually measured by electrophoretic 
mobility. Related to the surface 
potential and a measure of the electro- 
static forces of repulsion the particle or 
molecule is likely to meet when encoun- 
tering another of the same sign of charge. 
See cell electrophoresis. 

Zigmond chamber See orientation 
chamber. 

zinc An essential "trace" element being 
an essential component of the active site 
of a variety of enzymes. Zn2+ has high 
affinity for the side-chains of cysteine 
and histidine. Zinc is present in tissues at 
levels of ca. 0.1 mM, but intracellular 
levels must be much lower. 

zinc finger A motif associated with 
DNA-binding proteins. A loop of 12 
amino acids contains either two cysteine 
and two histidine groups (a "cysteine- 
histidine" zinc finger), or four cysteines 
(a "cysteine-cysteine" zinc finger), that 
directly coordinate a zinc atom. The 
loops (usually present in multiples) 
intercalate directly into the DNA helix. 
Originally identified in the RNA 
polymerase III transcription factor 
TFIIIA. 

zipper See leucine zipper. 

zippering Process suggested to occur in 
phagocytosis in which the membrane of 
the phagocyte covers the particle by a 
progressive adhesive interaction. The 
evidence for such a mechanism comes 
from experiments in which capped 
B-cells are only partially internalised, 
whereas those with a uniform opsonis- 
ing coat of anti-IgG are fully engulfed. 

ZO-1 High molecular weight protein 
(225 kD in mouse, 210 kD in MDCK cells) 
associated with zonula occludens (tight 
junction) in many vertebrate epithelia. 
Cingulin, which is distinct, is found in 
the same region. 

zona pellucida A translucent non- 
cellular layer surrounding the ovum of 
many mammals. 

zone of polarising activity The small 
group of mesenchyme cells in avian limb 
buds that is located at the posterior 
margin of the developing bud and that 
produces a substance, possibly retinoic 
acid, which provides positional informa- 
tion to the developing limb bud. 

zonula adhaerens Specialised intercel- 
lular junction in which the membranes 
are separated by 15-25 nm, and into 
which are inserted microfilaments. Sim- 
ilar in structure to two apposed focal 



